ABSTRACT

In this paper we present results from the weak lensing shape measurement
GRavitational lEnsing Accuracy Testing 2010 (GREAT10) Galaxy
Challenge. This marks an order of magnitude step change in the level of 
scrutiny employed in weak lensing shape measurement analysis. We pro- 
vide descriptions of each method tested and include 10 evaluation metrics 
over 24 simulation branches. 

GREAT10 was the first shape measurement challenge to include variable
fields; both the shear field and the Point Spread Function (PSF) vary
across the images in a realistic manner. The variable fields enable a variety 
of metrics that are inaccessible to constant shear simulations including a 
direct measure of the impact of shape measurement inaccuracies, and the 
impact of PSF size and ellipticity, on the shear power spectrum. To assess 
the impact of shape measurement bias for cosmic shear we present a general
pseudo-$C_\ell$ formalism that propagates spatially varying systematics in
cosmic shear through to power spectrum estimates. We also show how one- 
point estimators of bias can be extracted from variable shear simulations. 

The GREAT10 Galaxy Challenge received 95 submissions and saw a
factor of 3 improvement in the accuracy achieved by shape measurement
methods. The best methods achieve sub-percent average biases. We find a
strong dependence of accuracy on signal-to-noise, and indications
of a weak dependence on galaxy type and size. Some requirements for
the most ambitious cosmic shear experiments are met above a signal-to- 
noise ratio of 20. These results have the caveat that the simulated PSF was 
a ground-based PSF. Our results are a snapshot of the accuracy of current 
shape measurement methods and are a benchmark upon which improve- 
ment can continue. This provides a foundation for a better understanding 
of the strengths and limitations of shape measurement methods. 

Key words: Cosmology: observations, gravitational lensing: weak, methods:
statistical, techniques: image processing

1 INTRODUCTION

In this paper we present the results from the
GRavitational lEnsing Accuracy Testing 2010 (GREAT10)
Galaxy Challenge. GREAT10 was an image analysis
challenge for cosmology that focused on the task of
measuring the weak lensing signal from galaxies. Weak
lensing is the effect whereby the image of a source
galaxy is distorted by intervening massive structure
along the line-of-sight. In the weak field limit this
distortion is a change in the observed ellipticity of the
object, and this change in ellipticity is called shear. Weak
lensing is particularly important for understanding the
nature of dark energy and dark matter, because it
can be used to measure the cosmic growth of structure
and the expansion history of the Universe (see
reviews by e.g. Albrecht et al., 2001; Massey, Kitching,
Richards, 2010; Hoekstra & Jain, 2008; Bartelmann &
Schneider, 2001; Weinberg et al., 2012). In general, by
measuring the ellipticities of distant galaxies, hereafter
denoted "shape measurement", we can make
statistical statements about the nature of the intervening
matter. The full process through which photons
propagate from galaxies to detectors is described
in a previous companion paper, the GREAT10 Handbook
(Kitching et al., 2011).

There are a number of features, in the physical 
processes and optical systems, through which the pho- 
tons we ultimately use for weak lensing pass. These 
features must be accounted for when designing shape 
measurement algorithms. These are primarily the con- 
volution effects of the atmosphere and the telescope 
optics, pixelisation effects of the detectors used and 
the presence of noise in the images. The simulations
in GREAT10 aimed to address each of these complicating
factors. GREAT10 consisted of two concurrent
challenges, as described in Kitching et al. (2011): the
Galaxy Challenge, in which entrants were provided with
50 million simulated galaxies and asked to measure
their shapes and the spatial variation of the shear field
given a known Point Spread Function (PSF); and the
Star Challenge, in which entrants were provided with
an unknown PSF, sampled by stars, and asked to
reconstruct the spatial variation of the PSF across the
field.

In this paper we present the results of the
GREAT10 Galaxy Challenge. The challenge provided
a controlled simulation development environment in
which shape measurement methods could be tested,
and was run as a blind competition for 9 months from
December 2010 to September 2011. Blind analysis of
shape measurement algorithms began with the Shear
TEsting Programme (STEP; Heymans et al., 2006;
Massey et al., 2007) and GREAT08 (Bridle et al., 
2009, 2010). The blindness of these competitions is 
critical in testing methods under circumstances that 
will be similar to those encountered in real astronomi- 
cal data. This is because for weak lensing, unlike pho- 
tometric redshifts for example, we cannot observe a 
training set from which we know the shear distribu- 
tion (we can however observe a subset of galaxies at 
high signal-to-noise to train upon, which is something 
we address in this paper). 

The GREAT10 Galaxy Challenge is the first
shape measurement analysis that includes variable
fields. Both the shear field and the PSF vary across the
images in a realistic manner. This enables a variety of
metrics that are inaccessible to constant shear simulations
(where the fields are a single constant value
across the images), including a direct measure of the
impact of shape measurement inaccuracies on the inferred
shear power spectrum and a measure of the
correlations between shape measurement inaccuracies
and the size and ellipticity of the PSF.

We present a general pseudo-$C_\ell$ formalism for a
flat-sky shear field in Appendix A, which we use to 
show how to propagate general spatially varying shear 
measurement biases through to the shear power spec- 
trum. This has a more general application in cosmic 
shear studies. 

This paper summarises the results of the
GREAT10 Galaxy Challenge. We refer the reader to
a companion paper that discusses the GREAT10 Star
Challenge (Kitching et al., in prep). The main results,
distilled from the wealth of information presented in
this paper, are as follows:

(i) Signal-to-noise: We find a strong dependence of
the metrics on signal-to-noise below S/N = 10. However, we find methods
that meet requirements for the most ambitious experiments
when S/N > 20. We note that methods tested
here have been optimised for use on ground based data
in this regime.

(ii) Galaxy type: We find marginal evidence that 
model fitting methods have a relatively low depen- 
dence on galaxy type compared to model-independent 
methods. 

(iii) PSF dependence: We find contributions to biases
from PSF size, but less so from PSF ellipticity.

(iv) Galaxy Size: For large galaxies well sampled 
by the PSF, with scale radii > 2 times the mean PSF 
size we find that methods meet requirements on bias 
parameters for the most ambitious experiments. How- 
ever if galaxies are unresolved, with radii < 1 times 
the mean PSF size, biases become significant. 

(v) Training: We find that calibration on a high 
signal-to-noise sample can significantly improve a 
method's average biases. 

(vi) Averaging Methods: We find that averaging ellipticities
over several methods is clearly beneficial,
but that the weight assigned to each method will need 
to be correctly determined. 

In Section 2 we describe the Galaxy Challenge
structure and in Section 3 we describe the simulations.
Results are summarised in Section 4 and we present
conclusions in Sections 5 and 6. We make extensive
use of Appendices that contain technical information
on the metrics and a more detailed breakdown of the
performance of individual shape measurement methods.



2 DESCRIPTION OF THE COMPETITION

The GREAT10 Galaxy Challenge was run as an open
competition for 9 months between 3rd December 2010
and 2nd September 2011^1. The challenge was open for
participation from anyone, the website^2 served as the
portal for participants, and data could be freely downloaded.

1 Between 2nd September 2011 and 8th September 2011
we extended the challenge to allow submissions from those

The challenge was to reconstruct the shear power
spectrum from subsampled images of sheared galaxies
(Kitching et al. 2011). All shape measurement methods
to date do this by measuring the ellipticity from
each galaxy in an image, although scope for alternative
approaches was allowed. Participants in the challenge
were asked to submit either

(i) Ellipticity catalogues that contained an estimate 
of the ellipticity for each object in each image, or 

(ii) Shear power spectra, that consisted of an esti- 
mate of the shear power spectrum for each simulation 
set. 

Participants were required to access 1 TB of imag- 
ing data in the form of FITS images. Each image con- 
tained 10,000 galaxies arranged on a 100x100 grid. 
Each galaxy was captured in a single postage stamp 
of 48x48 pixels (to incorporate the largest galaxies in 
the simulation with no truncation), and the grid was 
arranged so that each neighbouring postage stamp was 
positioned contiguously i.e. there were no gaps be- 
tween postage stamps and no overlaps. Therefore each 
image was 4800x4800 pixels in size. The simulations 
were divided into 24 sets (see Section 3.1) and each set 
contained 200 images. For each galaxy in each image 
participants were provided with a functional descrip- 
tion of the PSF (described in Section 6) and an image 
showing a pixelised realisation of the PSF. In addition
a suite of development code was provided to help read
in the data and perform a simple analysis^3.
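
The image layout described above can be unpacked with a few lines of array code. The following is a minimal sketch in Python with NumPy, not the development code that was distributed with the challenge. The function name `split_into_stamps` and the row-major ordering of the stamps are assumptions made for illustration, and the image is a synthetic array rather than one read from a FITS file.

```python
import numpy as np

STAMP = 48   # postage stamp side in pixels (from the text)
GRID = 100   # postage stamps per image side (from the text)

def split_into_stamps(image, stamp=STAMP):
    """Cut an image of contiguous postage stamps into an array of shape
    (rows, cols, stamp, stamp). Row-major stamp ordering is assumed."""
    ny, nx = image.shape
    if ny % stamp or nx % stamp:
        raise ValueError("image size is not a multiple of the stamp size")
    rows, cols = ny // stamp, nx // stamp
    # Split each axis into (stamp index, pixel index inside the stamp),
    # then bring the two stamp indices to the front.
    return image.reshape(rows, stamp, cols, stamp).swapaxes(1, 2)

# Synthetic stand-in for one 4800x4800 pixel image
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(GRID * STAMP, GRID * STAMP), dtype=np.uint8)
stamps = split_into_stamps(image)

# Galaxy count implied by 24 sets x 200 images x 10,000 galaxies per image,
# close to the "50 million" quoted in the Introduction
n_galaxies = 24 * 200 * GRID * GRID
```

Because the stamps are contiguous with no gaps or overlaps, the split is a pure reshape and does not copy the pixel data.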



2.1 Summary of metrics 

The metric with which the live leaderboard was scored
during the challenge was a Quality factor $Q$, defined
as

$$ Q = 1000 \, \frac{5 \times 10^{-6}}{\int \mathrm{d}\ln\ell \, \left| \widetilde{C}^{EE}_\ell - C^{EE,\gamma\gamma}_\ell \right| \ell^2} \tag{1} $$

averaged over all sets, a quantity that relates the
reconstructed shear power spectrum $\widetilde{C}^{EE}_\ell$ with the true
shear power spectrum $C^{EE,\gamma\gamma}_\ell$. We describe this metric
in more detail in Appendices A and B. By evaluating
this metric for each submission, results were
posted to a live leaderboard that ranked methods
based on the value of $Q$. We will also investigate a
variety of alternative metrics extending the STEP $m$
and $c$ bias formalism to variable fields.
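
As an illustration of how the Quality factor behaves, the sketch below evaluates $Q = 1000 \times 5\times10^{-6} / \int \mathrm{d}\ln\ell \, |\widetilde{C}^{EE}_\ell - C^{EE,\gamma\gamma}_\ell| \, \ell^2$ for a single set with a trapezoidal integral. It is a hedged example, not the scoring code used for the leaderboard: the function name `quality_factor`, the choice of $\ell$ grid and the toy power spectrum are all assumptions.

```python
import numpy as np

def quality_factor(ell, c_est, c_true, norm=5e-6):
    """Q = 1000 * norm / integral d(ln ell) |c_est - c_true| ell^2,
    using the trapezoidal rule on the supplied ell grid."""
    ell = np.asarray(ell, dtype=float)
    integrand = np.abs(np.asarray(c_est) - np.asarray(c_true)) * ell**2
    d_ln_ell = np.diff(np.log(ell))
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * d_ln_ell)
    return 1000.0 * norm / integral

# Toy check: a residual proportional to 1/ell^2 makes the integrand constant,
# so the integral is delta * ln(ell_max / ell_min).
ell = np.logspace(np.log10(36.0), np.log10(3600.0), 50)
c_true = 1e-9 * (ell / 100.0) ** -1.5        # illustrative spectrum only
delta = 5e-6 / np.log(100.0)
c_est = c_true + delta / ell**2
q = quality_factor(ell, c_est, c_true)       # residual chosen so that Q = 1000
```

Halving the residual between the estimated and true spectra doubles $Q$, which is why larger values indicate a better reconstruction.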

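The measured ellipticity of an object at position
$\theta$ can be related to the true ellipticity and shear,

$$ e_{\rm measured}(\theta) = \gamma(\theta) + e_{\rm intrinsic}(\theta) + c(\theta) + m(\theta)\left[\gamma(\theta) + e_{\rm intrinsic}(\theta)\right] + q(\theta)\left[\gamma(\theta) + e_{\rm intrinsic}(\theta)\right]\left|\gamma(\theta) + e_{\rm intrinsic}(\theta)\right| + e_n(\theta), \tag{2} $$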



participants who had not met the deadline; those submissions
will be labelled in Section 4.
2 http://www.greatchallenges.info
3 http://great.roe.ac.uk/data/code/



with a multiplicative bias $m(\theta)$, an offset $c(\theta)$, and a
quadratic term $q(\theta)$ (this is $\gamma|\gamma|$, not $\gamma^2$, since we may
expect divergent behaviour to more positive and more
negative shear values for each domain respectively),
that in general are functions of position due to PSF
and galaxy properties. $e_n(\theta)$ is a potential stochastic
noise contribution. For spatially variable shear fields,
biases between measured and true shear can vary as a
function of position, mixing angular modes and power
between E and B-modes. In Appendix A, we present
a general formalism that allows for the propagation
of biases into shear power spectra using a pseudo-$C_\ell$
methodology; this approach has applications beyond
the treatment of shear systematics. The full set of metrics
is described in detail in Appendix B and
summarised in Table 1.
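
In the simplest case the multiplicative and additive terms can be estimated by regressing measured shear against true shear. The sketch below assumes $q = 0$ and spatially constant $m$ and $c$, and treats one shear component at a time; it is an illustration of the linear bias model, not the estimator of Appendix B. The function name `fit_linear_bias` and all numerical values are made up for the example.

```python
import numpy as np

def fit_linear_bias(gamma_true, gamma_meas):
    """Least-squares fit of gamma_meas = (1 + m) * gamma_true + c for one
    shear component; returns (m, c)."""
    slope, intercept = np.polyfit(gamma_true, gamma_meas, deg=1)
    return slope - 1.0, intercept

rng = np.random.default_rng(1)
gamma_true = rng.normal(0.0, 0.03, size=10_000)      # illustrative shear values
m_in, c_in = -0.02, 3e-4                             # made-up input biases

# Noise-free case: the input biases are recovered to machine precision
m_clean, c_clean = fit_linear_bias(gamma_true, (1.0 + m_in) * gamma_true + c_in)

# Noisy case: the scatter stands in for e_intrinsic + e_n in equation (2)
noise = rng.normal(0.0, 0.3, size=gamma_true.size)
m_noisy, c_noisy = fit_linear_bias(gamma_true, (1.0 + m_in) * gamma_true + c_in + noise)
```

With scatter of this size the noisy estimate of $m$ from 10,000 objects is far less precise than the input bias itself, which illustrates why very large numbers of galaxies are needed to measure sub-percent biases.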

The metric with which the live leaderboard was
scored was the $Q$ value, and the same metric was used
for ellipticity catalogue submissions and power spectrum
submissions. However in this paper we will introduce
and focus on $Q_{\rm dn}$ (see Table 1) that for ellipticity
catalogue submissions removes any residual pixel-noise
error (nominally associated with biases caused
by finite signal-to-noise or inherent shape measurement
method noise). For details see Appendix B. Note
that this is not a correction for ellipticity (shape) noise
which is removed in GREAT10 through the implementation
of a B-mode only intrinsic ellipticity field.

The metric $Q$ takes into account scatter between
the estimated shear and the true shear due to stochasticity
in a method or spatially varying quantities, such
that a small $m(\theta)$ and $c(\theta)$ do not necessarily correspond
to a large $Q$ value (see Appendix B). This is
discussed within the context of previous challenges in
Kitching et al. (2008). Spatial variation is important
because the shear and PSF fields vary, so that there 
may be scale-dependent correlations between them, 
and stochasticity is important because we wish meth- 
ods to be accurate (such that errors do not dilute cos- 
mological or astrophysical constraints) as well as being 
unbiased. 

For variable fields we can complement the linear
biases, $m(\theta)$ and $c(\theta)$, with a component that can be
correlated with any spatially varying quantity $X(\theta)$,
for example PSF ellipticity or size;

$$ m(\theta) = m_0 + \alpha X(\theta), \qquad c(\theta) = c_0 + \beta X(\theta), \tag{3} $$

with spatially constant terms $m_0$ and $c_0$ and correlation
coefficients $\alpha$ and $\beta$. Only ellipticity catalogue
submissions can have $m_0$, $c_0$, $\alpha$ and $\beta$ values calculated
because these parameters require individual galaxy ellipticity
estimates (in order to calculate the required
mixing matrices, see Appendices A and B). Throughout
we will refer to $m$ and $c$ as the one-point estimators
of bias and make the distinction between spatially
constant terms $m_0$ and $c_0$ and correlations $\alpha$ and $\beta$
only where clearly stated. Finally we also include a
non-linear shear response (see Table 1). We do not include
a discussion of this in the main results, because
$q\gamma|\gamma|$ is negligible for most methods, but show the results in
Appendix E.
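
The position-dependent model $m(\theta) = m_0 + \alpha X(\theta)$, $c(\theta) = c_0 + \beta X(\theta)$ is linear in its four parameters, so in a noise-free toy case it can be fitted by ordinary least squares. The sketch below is an assumption-laden illustration rather than the procedure of Appendices A and B: the function name `fit_spatial_bias`, the use of a single shear component and the input values are hypothetical.

```python
import numpy as np

def fit_spatial_bias(gamma_true, gamma_meas, x):
    """Fit gamma_meas - gamma_true = (m0 + alpha*x)*gamma_true + c0 + beta*x
    by linear least squares; returns (m0, alpha, c0, beta)."""
    design = np.column_stack([gamma_true, x * gamma_true, np.ones_like(x), x])
    coeffs, *_ = np.linalg.lstsq(design, gamma_meas - gamma_true, rcond=None)
    return tuple(coeffs)

rng = np.random.default_rng(2)
n = 5_000
gamma_true = rng.normal(0.0, 0.03, size=n)
x = rng.uniform(-0.1, 0.1, size=n)     # stand-in for PSF ellipticity per galaxy
m0_in, alpha_in, c0_in, beta_in = 0.01, 0.5, 2e-4, 0.03   # made-up inputs

gamma_meas = gamma_true + (m0_in + alpha_in * x) * gamma_true + c0_in + beta_in * x
m0, alpha, c0, beta = fit_spatial_bias(gamma_true, gamma_meas, x)
```

The design matrix makes explicit that $\alpha$ is constrained only through the product $X\gamma$, so a quantity $X$ with little variation across the field gives a poorly determined $\alpha$.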

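Metric | Definition | Features
$m$, $c$, $q$ | $\hat{\gamma} = (1+m)\gamma^t + c + q\gamma^t\lvert\gamma^t\rvert$ | One-point estimators of bias. Links to STEP
$Q$ | $1000\,\frac{5\times10^{-6}}{\int \mathrm{d}\ln\ell\,\lvert\widetilde{C}^{EE}_\ell - C^{EE,\gamma\gamma}_\ell\rvert\,\ell^2}$ | Numerator relates to bias on $w_0$
$Q_{\rm dn}$ | $1000\,\frac{5\times10^{-6}}{\int \mathrm{d}\ln\ell\,\lvert\widetilde{C}^{EE}_\ell - C^{EE,\gamma\gamma}_\ell - \frac{\langle\sigma_n^2\rangle}{N_{\rm realisation}N_{\rm object}}\rvert\,\ell^2}$ | Corrects $Q$ for pixel noise
$\mathcal{M}$, $\mathcal{A}$ | $\widetilde{C}^{EE}_\ell = C^{EE,\gamma\gamma}_\ell + \mathcal{A} + \mathcal{M}\,C^{EE,\gamma\gamma}_\ell$; $\mathcal{M} \simeq m^2 + 2m$, $\mathcal{A} \propto \sigma(c)^2$ | Power spectrum relations
$\alpha_X$ | $m(\theta) = m_0 + \alpha X(\theta)$ | Variation of $m$ with PSF ellipticity/size
$\beta_X$ | $c(\theta) = c_0 + \beta X(\theta)$ | Variation of $c$ with PSF ellipticity/size

Table 1. A summary of the metrics used to evaluate shape measurement methods for GREAT10. These are defined in
detail in Appendices A and B. We refer to $m$ and $c$ as the one-point estimators of bias, and make the distinction between
these and spatially constant terms ($m_0$, $c_0$) and correlations ($\alpha$, $\beta$) only where clearly stated.

To measure biases at the power spectrum level we
define constant linear bias parameters (see Appendix
A equation 23)

$$ \widetilde{C}^{EE}_\ell = C^{EE,\gamma\gamma}_\ell + \mathcal{A} + \mathcal{M}\, C^{EE,\gamma\gamma}_\ell \tag{4} $$

that relate the measured power spectrum to the true
power spectrum. These are approximately related
to the one-point shear bias $m$, and the variance of $c$,
by $\mathcal{M}/2 \simeq m$ for values of $m \ll 1$ and
$\sqrt{\mathcal{A}} \simeq \sigma(c)(2\pi/\ell_{\max})^{\ldots}$. These parameters can be calculated
for both ellipticity and power spectrum submissions.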
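
The first of these relations can be motivated by a short calculation. The sketch below is a simplified illustration rather than the derivation of Appendix A: it assumes $q = 0$, a spatially constant $m$, and an additive term $c(\theta)$ that is uncorrelated with the shear, and it writes $C^{cc}_\ell$ (a symbol introduced here) for the E-mode power spectrum of $c(\theta)$.

```latex
\begin{align}
\hat{\gamma}(\theta) &= (1+m)\,\gamma(\theta) + c(\theta), \\
\widetilde{C}^{EE}_\ell &= (1+m)^2\, C^{EE,\gamma\gamma}_\ell + C^{cc}_\ell \\
  &= C^{EE,\gamma\gamma}_\ell + C^{cc}_\ell + (2m + m^2)\, C^{EE,\gamma\gamma}_\ell .
\end{align}
```

Comparing with the form $\widetilde{C}^{EE}_\ell = C^{EE,\gamma\gamma}_\ell + \mathcal{A} + \mathcal{M}\,C^{EE,\gamma\gamma}_\ell$ listed in Table 1 gives $\mathcal{M} = 2m + m^2$ and $\mathcal{A} = C^{cc}_\ell$ under these assumptions, so that $\mathcal{M}/2 \simeq m$ when $m \ll 1$. If $c(\theta)$ behaves like white noise, $C^{cc}_\ell$ is independent of $\ell$ and proportional to $\sigma(c)^2$.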



3 DESCRIPTION OF THE SIMULATIONS 

In this Section we describe the overall structure of the
simulations. For details on the local modelling of the
galaxy and star profiles and the spatial variation of
the PSF and shear fields we refer to Appendix C.



3.1 Simulation structure 

The structure of the simulations was engineered such 
that, in the final analysis, the various aspects of 
performance for a given shape measurement method 
could be gauged. The competition was split into sets
of images, where one set was a 'fiducial' set and the remaining
sets represented perturbations about the parameters
in that set. Each set consisted of 200 images.
This number was justified by calculating the expected
pixel-noise effect on shape measurement methods (see
Appendix B) such that when averaging over all 200
images this effect should be suppressed (however, see
also Section 4 where we investigate this noise term
further).

Participants were provided with a functional description
and a pixelated realisation of the PSF at each
galaxy position. The task of estimating the PSF itself
was set as a separate 'Star Challenge' that is described
in a companion paper (Kitching et al. in prep).

The spatially variable shear field was the same in each of
the images within a set, but the PSF field and intrinsic
ellipticity could vary such that there were three kinds
of set:

• Type 1, 'Single Epoch', fixed $C^{EE}_\ell$, variable PSF,
variable intrinsic ellipticity.

• Type 2, 'Multi-Epoch', fixed $C^{EE}_\ell$, variable
PSF, fixed intrinsic ellipticity.

• Type 3, 'Stable Single Epoch', fixed $C^{EE}_\ell$, fixed
PSF, variable intrinsic ellipticity.



The default, fiducial, type was one in which both
PSF and intrinsic ellipticity vary between images in a
set. This was designed in part to test the ability of any
method which took advantage of stacking procedures,
where galaxy images are averaged over some population,
by testing whether stacking worked when either
the galaxy or PSF were fixed across images within a
set or not. Stacking methods achieved high scores in
GREAT08 (Bridle et al. 2010), but in actuality were
not submitted for GREAT10. For each type of set the
PSF and intrinsic ellipticity fields are always spatially
varying but this variation did not change within a set;
when we refer to a quantity being 'fixed' this means
that its spatial variation does not vary between images
within a set.

Type 1 (variable PSF and intrinsic field) sets test 
the ability of a method to reconstruct the shear field 
in the presence of both a variable PSF field and vari- 
able intrinsic ellipticity between images. This nomi- 
nally represents a sequence of observations of differ- 
ent patches of sky but with the same underlying shear 
power spectrum. Type 2 sets (variable PSF and fixed 
intrinsic field) represent an observing strategy where 
the PSF is different in each exposure of the same patch 
of sky (a typical ground based observation); so called 
'multi-epoch' data. Type 3 sets (fixed PSF) represent 
'single-epoch' observations with a highly stable PSF. 
These were only simple approximations to reality because,
for example, properties in the individual exposures
for the 'multi-epoch' sets were not correlated
(as they may be in real data), and the signal-to-noise
was constant in all images for the single and multi-epoch
sets. Participants were aware of the PSF variation
from image to image within a set but not of
the intrinsic galaxy properties or shear. Thus the conclusions
drawn from these tests will be conservative
with regard to the testing between the different set
types, relative to real data, where in fact this kind
of observation is known to the observer ab initio. In
subsequent challenges this hidden layer of complexity
could be removed.

In Appendix D we list in detail the parameter
values that define each set, and the parameters themselves
are described in the sections below. In Table 2
we summarise each set by listing its distinguishing feature
and parameter value. There were two additional
sets that used a pseudo-Airy PSF which we do not
include in this paper because of technical reasons (see
Appendix F).

Set Number | Set Name | Fixed PSF / Intrinsic Field | Distinguishing Parameter
1 | Fiducial | - | -
2 | Fiducial | PSF | -
3 | Fiducial | Int | -
4 | Low S/N | - | S/N = 10
5 | Low S/N | PSF | S/N = 10
6 | Low S/N | Int | S/N = 10
7 | High S/N Training Data | - | S/N = 40
8 | High S/N | PSF | S/N = 40
9 | High S/N | Int | S/N = 40
10 | Smooth S/N | - | S/N distribution Rayleigh
11 | Smooth S/N | PSF | S/N distribution Rayleigh
12 | Smooth S/N | Int | S/N distribution Rayleigh
13 | Small Galaxy | - | $r_b = 1.8$, $r_d = 2.6$
14 | Small Galaxy | PSF | $r_b = 1.8$, $r_d = 2.6$
15 | Large Galaxy | - | $r_b = 3.4$, $r_d = 10.0$
16 | Large Galaxy | PSF | $r_b = 3.4$, $r_d = 10.0$
17 | Smooth Galaxy | - | Size distribution Rayleigh
18 | Smooth Galaxy | PSF | Size distribution Rayleigh
19 | Kolmogorov | - | Kolmogorov PSF
20 | Kolmogorov | PSF | Kolmogorov PSF
21 | Uniform b/d | - | b/d fraction [0.3, 0.95]
22 | Uniform b/d | PSF | b/d fraction [0.3, 0.95]
23 | Offset b/d | - | b-d offset variance 0.5
24 | Offset b/d | PSF | b-d offset variance 0.5

Table 2. A summary of the simulation sets with the parameter or function that distinguishes each set from the fiducial
one. In the third column we list whether either the PSF or intrinsic ellipticity field (Int) were kept fixed between images
within a set. $r_b$ and $r_d$ are the scale radii of the bulge and disk components of the galaxy models in pixels; b/d is the ratio
between the integrated flux in the bulge and disk components of the galaxy models. See Appendix C and D for more details.

Training data was provided in the form of a set 
with exactly the same size and form as the other sets. 
In fact the training set was a copy of Set 7, a set 
which contained high signal-to-noise galaxies. In this 
way the structure was set up to enable an assessment
of whether training on high signal-to-noise data is useful
when extrapolating to other domains, in particular
the low galaxy signal-to-noise regime. This is similar
to being able to observe a region of sky with deeper
exposures than a main survey.



3.2 Variable shear and intrinsic ellipticity fields

In the GREAT10 simulations the key and unique aspect
was that the shear field was a variable quantity
and not a static scalar value (as for all previous
shape measurement simulations; STEP1, STEP2,
GREAT08). To make a variable shear field we generated
a spin-2 Gaussian random field from a ΛCDM
weak lensing power spectrum (Hu, 1999)



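$$ \ldots \tag{5} $$

where $P_{\delta\delta}$ is the matter power spectrum, and the lensing
weight can be expressed as

$$ \ldots \tag{6} $$

where the kernel is

$$ \ldots \, \frac{\ldots}{2a(r)} \int \mathrm{d}r' \, p_i(r') \, \ldots \tag{7} $$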



We have assumed a flat Euclidean geometry throughout
and $r_H$ is the horizon size. $p_i(r)$ refers to the redshift
distribution of the lensed sources in redshift bin
$i$; this expression can be generalised to an arbitrary
number (even a continuous set) of redshift bins (see
Kitching, Heavens & Miller, 2011). For these simulations
we have a single redshift bin with a median
redshift of $z_m = 1.0$ and a delta-function probability
distribution $p_i(r') = \delta^D(r' - r_i)$. We assume an
Eisenstein & Hu (1999) linear matter power spectrum
with a Smith et al. (2003) non-linear correction. The
cosmological parameter values used were $\Omega_m = 0.25$,
$h = H_0/100 = 0.75$, $n_s = 0.95$ and $\sigma_8 = 0.78$. In order
to add a random component to the shear power spectrum,
so that participants could not guess the functional
form, we added a series of Legendre polynomials
$P_n(x)$ up to 5th order, such that



$$ C^{EE,\gamma\gamma}_\ell \rightarrow C^{EE,\gamma\gamma}_\ell + 2 \times 10^{\ldots} \sum_{n} \ldots \, P_n(x_\ell) \tag{8} $$



where the variable $x_\ell = -1 + 2(\ell - 1)/(\ell_{\max} - 1)$
is contained within the range $[-1, 1]$ as $\ell$ varies from
$\ell_{\min}$ to $\ell_{\max}$. The shear field generated has an E-mode
power spectrum only. The size of the shear field was
$\theta_{\rm image} = 2\pi/\ell_{\min}$ and to generate the shear field we
set $\theta_{\rm image} = 10$ degrees, such that the range in $\ell$ we
used to generate the power was $\ell = [36, 3600]$ from
the fundamental mode to the grid separation cut-off.
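
The quoted multipole range follows directly from the field size and the grid spacing, as the short check below shows. This is a sketch under stated assumptions: the mapping $x_\ell = -1 + 2(\ell - 1)/(\ell_{\max} - 1)$ is taken as printed in the text, the helper name `x_of_ell` is hypothetical, and the amplitude and coefficients of the Legendre perturbation in equation (8) are not reproduced because they are not legible in this copy.

```python
import numpy as np

theta_image = np.deg2rad(10.0)     # field size quoted in the text
n_grid = 100                       # shear grid points per side

ell_min = 2.0 * np.pi / theta_image               # fundamental mode
ell_max = 2.0 * np.pi / (theta_image / n_grid)    # grid-separation cut-off

def x_of_ell(ell, ell_max):
    """Map ell onto the Legendre argument x = -1 + 2 (ell - 1) / (ell_max - 1)."""
    return -1.0 + 2.0 * (np.asarray(ell, dtype=float) - 1.0) / (ell_max - 1.0)

ell = np.arange(round(ell_min), round(ell_max) + 1)
x = x_of_ell(ell, ell_max)

# Legendre polynomials P_1 ... P_5 evaluated on x; each row of the identity
# matrix is a coefficient vector that selects a single P_n.
p_n = np.array([np.polynomial.legendre.legval(x, np.eye(6)[n]) for n in range(1, 6)])
```

Since every $P_n$ is bounded by unity on $[-1, 1]$, a sum of a few such terms gives a smooth, bounded perturbation of the input power spectrum across the whole multipole range.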

The shear field is generated on a grid of 100x100,
which is then converted into an image of galaxy objects
via an image generation code^4 with galaxy properties
described in Appendix C. When postage stamps
of objects are generated they point-sample the shear
field at each position, and a postage stamp is generated.
The postage stamps are then combined to form
an image.

Throughout, the intrinsic ellipticity field had a
variation that contained B-mode power only, as described
in the GREAT10 Handbook. This meant that
the contribution from intrinsic ellipticity correlations,
as well as from intrinsic shape noise, to the lensing shear
power spectra was zero.
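
The separation into E-mode and B-mode power can be illustrated with a flat-sky toy field. The sketch below draws a Gaussian shear field with E-mode power only on a 100x100 grid and then checks that the recovered B-mode power vanishes. It is not the GREAT10 simulation code: the toy power spectrum, the function names and the decision to zero the Nyquist modes (which keeps the fields exactly real) are all assumptions made for this example.

```python
import numpy as np

def fourier_angles(n):
    """Return |k| (radians per pixel) and polar angle phi on an n x n FFT grid."""
    kx = 2.0 * np.pi * np.fft.fftfreq(n)[:, None]
    ky = 2.0 * np.pi * np.fft.fftfreq(n)[None, :]
    return np.hypot(kx, ky), np.arctan2(ky, kx)

def e_mode_shear(n=100, seed=0):
    """Draw a Gaussian shear field (gamma1, gamma2) with E-mode power only."""
    rng = np.random.default_rng(seed)
    k, phi = fourier_angles(n)
    power = 1.0 / (1.0 + k) ** 2             # toy spectrum, NOT the GREAT10 one
    e_k = np.fft.fft2(rng.standard_normal((n, n))) * np.sqrt(power)
    e_k[0, 0] = 0.0                          # no mean shear
    if n % 2 == 0:                           # drop Nyquist modes so fields stay real
        e_k[n // 2, :] = 0.0
        e_k[:, n // 2] = 0.0
    gamma1 = np.fft.ifft2(np.cos(2.0 * phi) * e_k).real
    gamma2 = np.fft.ifft2(np.sin(2.0 * phi) * e_k).real
    return gamma1, gamma2

def eb_decompose(gamma1, gamma2):
    """Rotate the Fourier-space shear back into E- and B-mode coefficients."""
    _, phi = fourier_angles(gamma1.shape[0])
    g1_k, g2_k = np.fft.fft2(gamma1), np.fft.fft2(gamma2)
    e_k = np.cos(2.0 * phi) * g1_k + np.sin(2.0 * phi) * g2_k
    b_k = -np.sin(2.0 * phi) * g1_k + np.cos(2.0 * phi) * g2_k
    return e_k, b_k

gamma1, gamma2 = e_mode_shear()
e_k, b_k = eb_decompose(gamma1, gamma2)
e_power = np.mean(np.abs(e_k) ** 2)
b_power = np.mean(np.abs(b_k) ** 2)
```

Replacing the pair $(\cos 2\phi, \sin 2\phi)$ by $(-\sin 2\phi, \cos 2\phi)$ in `e_mode_shear` would give a field with B-mode power only, which is the property the text describes for the intrinsic ellipticity field.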



4 RESULTS 

In total the challenge received 95 submissions from
9 separate teams, covering 12 different methods.
These were

• 82 submissions before the deadline, 

• 13 submissions in the post challenge period, 

split into 

• 85 ellipticity catalogue submissions, 

• 10 power spectra submissions. 

We summarise the methods that analysed the
GREAT10 Galaxy Challenge in detail in Appendix
E. The method that won the challenge, with the highest
$Q$ value at the end of the challenge period, was
'fit2-unfold' submitted by the DeepZot team, authors
D. Kirkby and D. Margala.

During the challenge a number of aspects of the 
simulations were corrected (we list these in Appendix 
F). Several methods generated low scores due to mis- 
understanding of simulation details, and in this paper 
we summarise only those results for which these errata 
did not occur. In the following we choose the best per- 
forming entry for each of the 12 shape measurement 
method entries. 

4.1 One-point estimators of bias: m and c values

In Appendix B we describe how the estimators for
shear biases on a galaxy-by-galaxy basis in the simulations,
what we refer to as 'one-point estimators'
of biases, can be derived, and how these relate to
the STEP m and c parameters (Heymans et al. 2006).
In Figure 1 and in Table 3 we show the m and c biases
for the best performing entries for each method
(those with the highest quality factors). In Appendix
E we show how the m and c parameters, and the difference
of the measured and true shear $\hat{\gamma} - \gamma^t$, vary
for each method as a function of several quantities:
PSF ellipticity, PSF size, galaxy size, galaxy bulge to
disk fraction and galaxy bulge to disk angle offset.
We show in Appendix E that some methods have a
strong m dependence on PSF ellipticity and size (e.g.
TVNN and method04). Model fitting methods (gfit,
im3shape) tend to have fewer model-dependent biases,
whereas the KSB-like methods (DEIMOS, KSB f90)
have the smallest average biases.

4 To generate the image simulations we used a Monte
Carlo code that simulates the galaxy model and PSF
stages at a photon level; this code is a modified
version of that used for the GREAT08 simulations
(Bridle et al., 2010). The modified code is available
here http://great.roe.ac.uk/data/code/image_code, the
original code is by Konrad Kuijken, modified by SB and
SBr for GREAT08, and modified by TDK for GREAT10.

4.2 Variable shear 

In the lefthand panel of Figure 2 we show the values
of the linear power spectrum parameters $\mathcal{M}$ and $\mathcal{A}$ for
each method for each set, and display by color code the
Quality factor $Q_{\rm dn}$. In Table 3 we show the mean values
of these parameters averaged over all sets. We find
a clear anti-correlation between $\mathcal{M}$ and $\mathcal{A}$ and $Q_{\rm dn}$,
with higher Quality factors corresponding to smaller
$\mathcal{M}$ and $\mathcal{A}$ values. We will explore this further in the
subsequent sections. We refer the reader to Appendix
B where we show how the $\mathcal{M}$, $\mathcal{A}$ and $Q_{\rm dn}$ parameters
are expected to be related in an ideal case. In the
righthand panel of Figure 2 we also show the $\mathcal{M}$, $\mathcal{A}$
and $Q_{\rm dn}$ values for each method averaged over all sets.

In the lefthand panel of Figure 3 we show the 
effect that the pixel noise denoising step has on the 
Quality factor, Q. Note that the way that the de- 
noising step is implemented here uses the variance 
of the true shear values (but not the true shear val- 
ues themselves). This is a method that was not avail- 
able to power spectrum submissions and indeed part 
of the challenge was to find optimal ways to ac- 
count for this in power spectrum submissions. The 
final layer used to generate the 'fit2-unfold' submis- 
sion performed power-spectrum estimation and used 
the model-fit errors themselves to determine and sub- 
tract the variance due to shape measurement errors, 
including pixel noise. We find as expected that $Q$ in
general increases for all methods when pixel noise is
removed, by a factor of $\lesssim 1.5$, such that a method
that has $Q \simeq 100$ has a $Q_{\rm dn} \simeq 150$. When this correction
is applied the method 'fit2-unfold' still obtains
the highest Quality factor, and the ranking of the top 
five methods is unaffected. 

4.2.1 Training

Several of the methods used the training data to help
debug and test code. For example, and in particular,
'fit2-unfold' used the data to help build the galaxy
models used and to set initial parameter values and
ranges in the maximum likelihood fits. This meant
that 'fit2-unfold' performed particularly well in sets
similar to the training data (sets 7, 8, and 9) at high
signal-to-noise; for details see Appendix D Figure 23,
where 'fit2-unfold' has smaller combined $\mathcal{M}$ and $\mathcal{A}$
values than any other method for some sets.

To investigate whether using high signal-to-noise 
training data is useful for methods we investigate a 
scenario that training on the power spectra had been 
used for all methods. This modification was poten- 
tially available to all participants if they chose to im- 
plement it. To do this we measure the M and A values 
from the high signal-to-noise Set 7 (see Table 2) and 
apply the transformation to the power spectra, which 
is to first order equivalent to an m and c correction, 
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Method 


Q 


Qdn 


Qdn & trained 




m 


c/10-4 


M/2 




tARES 50/50 


125.42 


215.09 


314.97 


-0 


026483 


0.35 


-0.035748 


0.46 


tcat7unfold2 (ps) 


190.22 




188.06 








-0.013722 


0.44 


DEIMOS C6 


67.71 


149.88 


315.59 





006554 


0.08 


0.012131 


0.39 


fit2-unfold (ps) 


288.77 




304.11 








-0.046022 


0.40 


gfit den cs 


124.65 


253.99 


242.55 





007611 


0.29 


0.010557 


0.36 


KSB 


119.48 


174.16 


220.48 


-0 


059520 


0.86 


-0.062690 


0.57 


*KSR fQfl 

J. V o j_) lijyj 


58.29 


148.83 




— 


UKJOOO ^ 


0.19 


n nni i ^^7 

U. UU -L J-O ( 


0.50 


*im3shape NBCO 


99.71 


146.57 


236.93 


-0 


049982 


0.12 


-0.080900 


0.61 


tMegaLUTsim2.1 b20 


90.72 


99.79 


166.34 


-0 


265354 


-0.55 


-0.233981 


0.87 


method 4 


106.85 


119.85 


158.73 


-0 


174896 


-0.12 


-0.120779 


0.64 


tNN23 func 


104.76 


70.46 


11.43 


-0 


239057 


0.47 


-0.029440 


0.61 


shapefit 


47.00 


88.75 


192.46 





108292 


0.17 


-0.028606 


0.52 



Table 3. The QuaUty factors, Q, with denoising and training, and the m and c values for each method (not available 
for power spectrum submissions) that we explore in detail in this paper, in alphabetical order of the methods name. A 
"(ps)" indicates a power spectrum submission, in these cases Q^^ & trained = Qtrainedi a^U others were ellipticity catalogue 
submissions. An * indicates that this team had knowledge of the internal parameters of the simulations, and access to the 
image simulation code. A t indicates that this submission was made in the post-challenge time period. 
Figure 1. In the lefthand panel we show the multiplicative m and additive c biases for each ellipticity catalogue method
for which one-point estimators can be calculated; see Appendix B. The symbols indicate the method, with a legend in the
righthand panel. The central panel expands the x- and y-axes to show the best performing methods.



$$C_\ell \rightarrow \frac{C_\ell - \mathcal{A}_{\rm set=7}}{1 + \mathcal{M}_{\rm set=7}} \qquad (9)$$



to calibrate the method using the training data. In
Figure 3 we show the resulting Quality factors when
we apply both a denoising step and a training step,
and when we apply a training step only. When both
steps are applied we find that the Quality factor improves
by a factor of more than 2 and that some methods perform
as well as the 'fit2-unfold' method, if not better. In particular
'DEIMOS C6' achieves an average Quality factor of
316 (see Table 3). We find that the increase in the
quality factor is uniform over all sets, including the
low signal-to-noise sets.

We conclude that it was a combination of model
calibration on the data, and using a denoised power
spectrum, that enabled 'fit2-unfold' to win the challenge.
We also conclude that calibration of measurements
on high signal-to-noise samples, i.e. those
that could be observed using a deep survey within a
wide/deep survey strategy, is an approach that can
improve shape measurement accuracy by about a factor
of two. Note that this approach is not shear calibration
as it has been practised historically, because
the true shear is not known. This holds as long as the
deep survey is a representative sample and the PSF
of the deep data has similar properties to the PSF in
the shallower survey.
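
As an illustration of the training step, the rescaling of equation (9) can be sketched in a few lines of Python. This is a minimal sketch and not the code used for the challenge analysis: it assumes that the measured and true power spectra of the training set are related by the linear model C_measured = (1 + M) C_true + A, it estimates M and A with an ordinary least-squares fit (an assumption made for illustration), and the function names and toy spectra are hypothetical.

```python
import numpy as np

def fit_linear_bias(c_measured, c_true):
    """Least-squares fit of C_measured = (1 + M) * C_true + A on the training set."""
    slope, intercept = np.polyfit(c_true, c_measured, 1)
    return slope - 1.0, intercept  # (M, A)

def apply_training(c_measured, m_train, a_train):
    """Equation (9): C -> (C - A_train) / (1 + M_train)."""
    return (np.asarray(c_measured) - a_train) / (1.0 + m_train)

# Toy training set: a power-law spectrum with known biases (hypothetical values).
ell = np.linspace(100.0, 2000.0, 20)
c_true = 1e-9 * (ell / 1000.0) ** -1.2
c_meas = (1.0 - 0.05) * c_true + 2e-11

m_hat, a_hat = fit_linear_bias(c_meas, c_true)
c_cal = apply_training(c_meas, m_hat, a_hat)  # calibrated spectrum
```

In practice the values fitted on the training set would be applied to the spectra of the other sets, which is only expected to help when those sets share the bias properties of the training set.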



4.2.2 Multi-epoch data

In Figure 3 we show how Qdn varies for each submission
averaged over all those sets that had a fixed intrinsic
ellipticity field (Type 2) or a fixed PSF (Type 3),
described in Section 3.1. Despite the simplicity of this
implementation we find that for the majority of methods
this variation, corresponding to multi-epoch data,
results in an improvement in Qdn by a factor of approximately
1.1 to 1.3, although there is large scatter in the relation.
In GREAT10 the coordination team made a decision
to keep the labelling of the sets private, so that participants
were not explicitly aware that these particular
sets had the same PSF (although the functional PSFs
were available) or the same intrinsic ellipticity field.
These were designed to test stacking methods, how- 
ever no such methods were submitted. The approach 
of including this kind of subset can form a basis for 
Figure 2. In the lefthand panel we show M and A for each method for each set. The colour scale represents the logarithm
of the quality factor Qdn. In the righthand panel we show the metrics M, A and Qdn for each method averaged over all
sets. For a breakdown of these into dependence on set type see Figure 4.







Figure 3. In the lefthand panel we show the un-modified quality factor Q (equation 1) and how this relates to the quality
factor with pixel (shape measurement) noise removed, Qdn, and the quality factor obtained when high signal-to-noise training
is applied to each submission (equation 9). Methods that submitted power spectra could not be modified to remove the
denoising in this way, so only the training values are shown. The righthand panel shows the Qdn for those sets with
fixed intrinsic ellipticities ('multi-epoch'; Type 2) or a fixed PSF ('stable single epoch'; Type 3) over all images compared
to the quality factor in the variable PSF and intrinsic ellipticity case ('single epoch'; Type 1).



further investigations. 



4.2.3 Galaxy signal-to-noise



As a summary we show in Figure 4 the population
of the M, A and Qdn parameters for each of the
quantities that were varied between the sets, for all
methods (averaging over all the other properties of
the sets that are kept constant between these variations).
In the following Sections we will analyse each
behaviour in detail.



In the top row of Figure 5 we show how the metrics
for each method change as a function of the galaxy
signal-to-noise. We find a clear trend for all methods
to achieve better measurements on higher signal-to-noise
galaxies, with higher Q values and smaller
multiplicative and additive biases M and A. In particular
'fit2-unfold', 'cat2-unfold', 'shapefit' and 'KSB f90'
have a close to zero multiplicative bias for S/N > 20.
Because signal-to-noise has a particularly strong impact
we tabulate the M/2 and √A values in Table 4. We
also show in the lower row of Figure 5 the breakdown 
of the multiplicative and additive biases into the com- 
Figure 4. In each panel we show the metrics, M, A and Qdn, for each of the parameter variations between sets, for each
submission; the colour scale labels the logarithm of Qdn as shown in the lower right. The first row shows the signal-to-noise
variation, the second row shows the galaxy size variation, the third row shows the galaxy model variation (the galaxy models
are: uniform bulge-to-disk fractions where each galaxy has a b/d ratio randomly sampled from the range b/d = [0.3, 0.95]
with no offset (Uniform B/D No Offset), a 50% bulge-to-disk fraction b/d = 0.5 with no offset (50/50 B/D No Offset) and
a 50% bulge-to-disk fraction b/d = 0.5 with a bulge/disk centroid offset (50/50 B/D Offset)), the fourth row shows PSF
variation with and without Kolmogorov (KM) PSF variation.
Figure 5. In the top panels we show how the metrics, M, A and Qdn, for submissions change as the signal-to-noise increases;
the colour scale labels the logarithm of Qdn. In the lower panels we show the PSF size and ellipticity contributions α and
β. In the bottom-lefthand panel we show the key that labels each method.





                       S/N=10                    S/N=20                    S/N=40
Method                 M/2         √A/10^-4      M/2         √A/10^-4      M/2         √A/10^-4
†ARES 50/50            0.087731    0.935697      0.049419    0.407512      0.008458    0.209666
†cat7unfold2 (ps)      0.058300    0.722149      0.007312    0.339832      -0.003201   0.126659
DEIMOS C6              0.047518    0.852336      0.021164    0.392274      -0.014486   0.094460
fit2-unfold (ps)       -0.162941   0.565951      -0.001792   0.428405      -0.003452   0.082378
gfit                   0.085774    0.817820      -0.013417   0.290786      -0.015567   0.146713
KSB                    0.163389    1.225922      0.058562    0.412507      0.029040    0.290177
*KSB f90               0.081783    1.007300      0.011454    0.421214      0.005595    0.199879
*im3shape NBC0         0.177015    1.141387      0.084891    0.469249      0.042984    0.378769
†MegaLUTsim2.1 b20     0.527131    1.591224      0.177168    0.785525      0.203427    0.758546
method 4               0.206427    1.205432      0.110811    0.545012      0.082873    0.322588
†NN23 func             0.074866    0.608485      -0.005973   0.640604      -0.083064   0.899982
shapefit               -0.104730   1.095732      0.021622    0.421100      0.001374    0.196678

Table 4. The metrics M/2 ≃ m and √A ≃ σ(c) for each of the signal-to-noise values used in the simulations.



ponents that are correlated with the PSF size and
ellipticity (see Table 1). We find that for the methods
with the smallest biases at high signal-to-noise (e.g.
'DEIMOS', 'KSB f90', 'ARES') the contribution from
the PSF size is also small. For all methods we find
that the contribution from PSF ellipticity correlations
is subdominant.



4.2.4 Galaxy size

In Figure 6 we show how the metrics of each method 
change as a function of the galaxy size - the mean PSF 
size was ~ 3.4 pixels. Note that the PSF size is statis- 
tically the same in each set, such that a larger galaxy 
size corresponds to either a case where the galaxies 
are larger in a given survey or where observations are 



taken where the pixel size and PSF size are relatively 
smaller for the same galaxies. 

We find that the majority of methods have a weak 
dependency on the galaxy size, but that at scales of 
< 2 pixels, or size/mean PSF size ~ 0.6, the accuracy 
decreases (larger M and A and smaller Qdn). This 
weak dependence is partly due to the small (but re- 
alistic) dynamical range in size, compared to a larger 
dynamical range in signal-to-noise. The exceptions are 
'cat7unfold2', 'fit2unfold' and 'shapefit' that appear to 
perform very well on the fiducial galaxy size and less 
well on the small and large galaxies - this is consistent 
with the model calibration approach of these meth- 
ods, which was done on Set 7 that used the fiducial 
galaxy type. The PSF size appears to have a small 
contribution at large galaxy sizes, as one should ex- 
Figure 6. In the top panels we show how the metrics, M, A and Qdn, for submissions change as the galaxy size increases;
the colour scale labels the logarithm of Qdn. In the lower panels we show the PSF size and ellipticity contributions α and
β. In the bottom-lefthand panel we show the key that labels each method. The mean PSF is the mean within an image, not
between all sets.




pect, but a large contribution to the biases at scales 
smaller than the mean PSF size. We find that the 
methods with largest biases have a strong PSF size 
contribution. Again the PSF ellipticity has a subdom- 
inant contribution to the biases for all galaxy sizes. 



4.2.5 Galaxy model

In Figure 7 we show how each method's metrics
change as a function of the galaxy type. The majority
of methods have a weak dependency on the galaxy
model. The exceptions, similar to the galaxy size
dependence, are 'cat7unfold2', 'fit2unfold' and 'shapefit',
which appear to perform very well on the fiducial galaxy
model and less well on the other galaxy models; this
again is consistent with the model calibration approach of
these methods. Again the contribution from the PSF
size dependence is dominant over the PSF ellipticity
dependence, and is consistent with no model dependency
for the majority of methods, except those highlighted
here. We refer to Section 4.4 and Appendix E
for a breakdown of m and c behaviour as a function
of galaxy model for each method.

4.2.6 PSF model 

In Figure 8 we show the impact of changing the PSF 
spatial variation on the metrics for each method. We 



show results for the fiducial PSF, which does not in- 
clude a Kolmogorov (turbulent atmosphere) power 
spectrum, and one which includes a Kolmogorov 
power spectrum in PSF ellipticity. We find that the 
majority of methods have a weak dependence on the 
inclusion of the Kolmogorov power. But it should be 
noted that participants knew the local PSF model ex- 
actly in all cases. 

For the Kolmogorov power the PSF size depen- 
dence has a similar order of magnitude to the PSF 
ellipticity dependence for the additive bias A. This is 
different to the other set-dependencies. This can be 
explained by the fact that the spatial ellipticity varia- 
tion in the other sets is lower than in the Kolmogorov 
sets. 



4.3 Averaging methods 

In order to reduce shape measurement biases one may 
also wish to average together a number of shape mea- 
surement methods. In this way any random compo- 
nent, and any biases, in the ellipticity estimates may 
be reduced. In fact the 'ARES' method (see Appendix 
E) averaged catalogues from DEIMOS and KSB and 
attained better quality metrics. Doing this exploited 
the fact that DEIMOS had in some sets a strong re- 
sponse to the ellipticity whereas KSB had a weak re- 
sponse. 

To test this we averaged the ellipticity catalogues 
Figure 7. In the top panels we show how the metrics, M, A and Qdn, for submissions change as the galaxy model changes;
the colour scale labels the logarithm of Qdn. The galaxy models are: uniform bulge-to-disk fractions, where each galaxy has a b/d
ratio randomly sampled from the range b/d = [0.3, 0.95] with no offset (Uni.), a 50% bulge-to-disk fraction b/d = 0.5 with
no offset (50/50.) and a 50% bulge-to-disk fraction b/d = 0.5 with a bulge/disk centroid offset (w/O). In the lower panels
we show the PSF size and ellipticity contributions α and β. In the bottom-lefthand panel we show the key that labels each
method.



[Figure 8 panels: each panel is plotted against PSF Type ('Plus KM', 'No KM'); the lower panels separate the PSF ellipticity and PSF size contributions. Legend: KSB, method 4, im3shape NBC0, DEIMOS C6, KSB f90, ARES 50/50, gfit, shapefit, NN23 func, fit2-unfold (ps), cat7unfold2 (ps).]

Figure 8. In the top panels we show how the metrics, M, A and Qdn, for submissions change as the PSF model changes; the
colour scale labels the logarithm of Qdn. The PSF models are the fiducial PSF, and the same PSF except with a Kolmogorov
power spectrum in ellipticity added. In the lower panels we show the PSF size and ellipticity contributions α and β. In the
bottom-lefthand panel we show the key that labels each method.
Figure 9. The Quality factor as a function of signal-to-noise (left panel), galaxy size (middle panel) and galaxy type (right
panel) for an averaged ellipticity catalogue submission (red, using the averaging described in Section 4.3), compared to the
methods used to average (black).



from the entries with the best metrics for each method
that submitted an ellipticity catalogue (ARES 50/50,
DEIMOS C6, gfit, im3shape NBC0, KSB, KSB f90,
MegaLUTsim2.1 b20, method 4, shapefit) as so:

$$e_i = \frac{\sum_{\rm methods} w_{m,i}\, e_{m,i}}{\sum_{\rm methods} w_{m,i}} \qquad (10)$$

where $i$ labels each galaxy and in general $w_{m,i}$ is
some weight that depends on the method, galaxy and
PSF properties. We wish to weight methods that perform
better, and so choose the Quality factor from the
high signal-to-noise training set (set 7) as the weight,
$w_{m,i} = Q_{{\rm dn},m}({\rm set\ 7})$, applied over all other sets. This
is close to an inverse variance weight on the noise induced
on the shear power spectrum ($\propto 1/\sigma^2_{\rm sys}$). We
leave the determination of optimal weights for future
investigation.

We find that the average Quality factors over all
sets for this approach are Q = 131 and Qdn = 210,
which are slightly smaller on average than some of the
individual methods. However we find that for the fiducial
signal-to-noise and large galaxy size the Quality
factor increases, see Figure 9. This suggests that such
an averaging approach can improve the accuracy of an
ellipticity catalogue, but that a weight function should
be optimised to be a function of signal-to-noise, galaxy
size and type; averaging many methods with
a similar over- or under-estimation of the shear would
not, however, yield an improvement in the combination.
If we take the highest quality factors in each set, as an
optimistic case in which a weight function had been found
that could identify the best shape measurement in each
regime, we find an average Qdn = 393.
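
The weighted combination of equation (10) can be sketched as follows. This is an illustrative sketch under stated assumptions: ellipticities are stored as complex numbers e = e1 + i e2, every method has measured the same galaxies in the same order, and the function name `average_catalogues` and the toy biases are hypothetical rather than taken from any submission.

```python
import numpy as np

def average_catalogues(ellipticities, weights):
    """Weighted mean over methods for each galaxy (equation 10).

    ellipticities: complex array of shape (n_methods, n_galaxies)
    weights: shape (n_methods,) for one weight per method, e.g. Q_dn(set 7),
             or shape (n_methods, n_galaxies) for per-galaxy weights
    """
    e = np.asarray(ellipticities, dtype=complex)
    w = np.asarray(weights, dtype=float)
    if w.ndim == 1:
        w = w[:, None]
    w = np.broadcast_to(w, e.shape)
    return (w * e).sum(axis=0) / w.sum(axis=0)

# Toy example: two methods with opposite multiplicative biases of 4 per cent.
true_e = np.array([0.10 + 0.02j, -0.05 + 0.03j])
method_a = 1.04 * true_e   # over-estimates the ellipticity
method_b = 0.96 * true_e   # under-estimates the ellipticity
combined = average_catalogues([method_a, method_b], weights=[200.0, 200.0])
```

With equal weights the opposite biases cancel in this toy case; with unequal weights, or with methods that are biased in the same direction, a residual bias remains, which is why the choice of weight matters.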



4.4 Overall performance 

We now list some observations of method accuracy
for each method by commenting on the behaviour of
the metrics and dependencies shown in Section 4 and
Appendix E. Words such as 'relative' are with respect
to the other methods analysed here. This is a snapshot
of method performance as submitted for the GREAT10
blind analysis.

• KSB; has low PSF ellipticity correlations, and a 
small galaxy morphology dependence, however it has 
a relatively large absolute m bias value. 



• KSB f90; has small relative m and c biases on 
average, but a relatively strong PSF size and galaxy 
morphology dependence, in particular on the galaxy 
bulge fraction. 

• DEIMOS; has small m and c biases on average, 
but a relatively strong dependence on galaxy morphol- 
ogy again in particular on the bulge fraction, similar to 
KSB f90. Dependence on galaxy size is low except for 
small galaxies with size smaller than the mean PSF. 

• im3shape; has a relatively large PSF ellipticity
and size correlation, a small galaxy size dependence
for m and c but a stronger bulge fraction dependence.

• gfit; has relatively small average m and c biases,
and a small galaxy morphology dependence, but there is a
relatively large correlation with PSF ellipticity. This
was the only method to employ a denoising step at
the image level, suggesting that this may be partly
responsible for the small biases.

• method 4; has relatively strong PSF ellipticity, 
size and galaxy type dependence. 

• fit2unfold; has strong model dependence, but
relatively small m and c biases for the fiducial model
type, and also a relatively low PSF ellipticity correlation.

• cat2unfold; has strong model dependence in 
particular on galaxy size, but relatively small m and c 
biases for the fiducial model type, and also a relatively 
low PSF ellipticity correlation. 

• shapefit; has a relatively low quality factor, and 
a strong dependence on model types and size that are 
not the fiducial values, but small m and c biases for 
the fiducial model type. 

To make some general conclusions, we find that 

(i) Signal-to-noise; We find a strong dependence of 
the metrics below S/N= 10, however we find methods 
that meet requirements for the most ambitious exper- 
iments when S/N> 20. 

(ii) Galaxy type; We find marginal evidence that
model fitting methods have a relatively low dependence
on galaxy type compared to KSB-like methods,
but that this is only true if the model matches the
underlying input model (note that GREAT10 used simple
models). We find evidence that if one trains on a
particular model then biases are small for this subset
of galaxies.

(iii) PSF dependence; Despite the PSF being 
known exactly we find contributions to biases from 
Figure 10. The cumulative submission number as a function
of the challenge time, which started on 3rd December
2010 and ran for 9 months.

PSF size, but less so from PSF ellipticity. The meth- 
ods with the largest biases have a strong PSF size 
correlation. 

(iv) Galaxy Size: For large galaxies well sampled
by the PSF, with scale radii > 2 times the mean PSF
size, we find that methods meet requirements on bias
parameters for the most ambitious experiments. However
if galaxies are unresolved, with radii smaller than the
PSF size, biases become significant.

(v) Training: We find that calibration on a high
signal-to-noise sample can significantly improve a
method's average biases. This is true whether training
is a model calibration, or a more direct form of training
on the ellipticity values or power spectra themselves.

(vi) Averaging Methods: We find that averaging 
methods is clearly beneficial, but that the weight as- 
signed to each method will need to be correctly deter- 
mined. An individual entry (ARES) found that this 
was the case, and we find similar conclusions when 
averaging over all methods. 



5 ASTROCROWDSOURCING 

The GREAT10 Galaxy Challenge was an example of
'crowdsourcing' astronomical algorithm development
('astrocrowdsourcing'). This was part of a wider effort
during this time period, which included the GREAT10
Star Challenge and the sister project Mapping Dark
Matter^ (see companion papers for these challenges).
In this Section we discuss this aspect of the challenge
and list some observations.

GREAT10 was a major success in its effort to generate
new ideas and attract new people into the field.
For example, the winners of the challenge (authors D. 
Kirkby and D. Margala), were new to the field of grav- 
itational lensing. A variety of entirely new methods 
have also been attempted for the first time on blind 
data, including the Look Up Table (MegaLUT) ap- 
proach, an auto-correlation approach (method 4 and 
TVNN), and the use of training data. Furthermore 
the TVNN method is a real pixel-level deconvolution 

^ Run in conjunction with Kaggle, http://www.kaggle.com/c/mdm



method, which is the first time a genuine deconvolu- 
tion of the data has been used in shape measurement. 

The limiting factor in designing the scope of the
GREAT10 Galaxy Challenge was the size of the simulations,
which was kept below 1TB for ease of distribution;
a larger challenge could have addressed even
more observational regimes. In the future, executables
could be distributed that generate the data locally.
However in this case participants may still need to
store the data. Another approach might be to host
challenges on a remote server where participants can
upload and run algorithms. Care should be taken, however,
to retain the integrity of the blindness of a challenge,
without which results become largely meaningless,
as methods could be tuned to the parameters or
functions of specific solutions if those solutions are
known a priori. We require algorithms to be of high
fidelity and to be useful on large amounts of data,
which requires them to be fast: an algorithm that takes
a second per galaxy needs ~ 50 CPU years to run on
$1.5 \times 10^9$ galaxies (the number observable by the most
ambitious lensing experiments, e.g. Euclid^, Laureijs
et al., 2011). A large simulation generates innovation
in this direction.
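
The run-time estimate quoted above follows from simple arithmetic, reproduced here as a short Python check. The galaxy count of 1.5 × 10^9 is taken as an assumption that is consistent with the quoted figure of about 50 CPU years, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year in seconds

def cpu_years(n_galaxies, seconds_per_galaxy):
    """Total single-core run time, in years, for a given per-galaxy cost."""
    return n_galaxies * seconds_per_galaxy / SECONDS_PER_YEAR

euclid_like = cpu_years(1.5e9, 1.0)    # roughly 50 CPU years at 1 s per galaxy
euclid_fast = cpu_years(1.5e9, 0.01)   # roughly half a CPU year at 10 ms per galaxy
```

The cost scales linearly with both the number of galaxies and the time per galaxy, so a hundredfold speed-up turns decades of single-core time into months.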

In Figure 10 we show the cumulative submission
of the GREAT10 Galaxy Challenge as a function of
time, from the beginning of the challenge to the end
and in the post-challenge submission period. All submissions
(except one made by the GREAT10 coordination
team) were made in the last 3 weeks of the
9 month period. For future challenges intra-challenge 
milestones could be used to encourage early submis- 
sions. This submission profile also reflects the size and 
complexity of the challenge; it took time for partici- 
pants to understand the challenge and to run algo- 
rithms over the data to generate a submission. For fu- 
ture challenges submissions on smaller subsets of the 
data could be enabled, with submission over the entire 
data set being optional. 

We note that the winning team (Kirkby and Mar- 
gala) made 18 submissions during the challenge, com- 
pared to the mean submission number of 9. The win- 
ners also recognised from the information provided 
that the submission procedure was open to power 
spectrum and ellipticity catalogue submissions. The 
leaderboard was designed such that accuracy was re- 
ported in a manner that was indicative of perfor- 
mance, but such that this information could not be 
trivially used to directly calibrate methods (for ex- 
ample if m and c were provided a simple ellipticity 
catalogue correction could have been made). 

Many of these issues were overcome in the sis- 
ter Mapping Dark Matter challenge (see the Mapping 
Dark Matter results paper, Kitching et al., in prep) 
that received over 700 entries, over 2000 downloads of 
the data and a constant rate of submission. It also used 
an alternative model for leaderboard feedback where 
the simulated data was split into public and private 
sets, and useful feedback only provided for the public 
sets. 



http://www.euclid-ec.org
6 CONCLUSIONS

The GREAT10 Galaxy Challenge was the first weak
lensing shear simulation to include variable fields:
both the PSF and the shear field varied as a function
of position. It was also the largest shear simulation to
date, consisting of over 50 million simulated galaxies,
and a total of 1TB of data. The challenge ran for 9
months from December 2010 to September 2011, and
during that time approximately 100 submissions were
made.

In this paper we define a general pseudo-$C_\ell$
methodology for propagating shape measurement biases
into cosmic shear power spectra and use this to
derive a series of metrics that we use to investigate
methods. We present a quality factor Q that relates
the inaccuracy in shape measurement methods to the
shear power spectrum itself. A Q = 1000 denotes a
method that could measure the dark energy equation
of state parameter $w_0$ with a bias less than or equal
to the predicted statistical error from the most ambitious
planned weak lensing experiments (for a more
general expression we refer to Massey et al., in prep).
We show how one can correct such a metric to account
for pixel noise in a shape measurement method. During
the challenge, submissions were publicly ranked by
this metric Q on a live leaderboard.

We show how a variable shear simulation can be
used to determine the m and c parameters (Heymans et
al., 2006), which are a measure of bias between the measured
and true shear (those parameters used in constant
shear simulations: STEP and GREAT08), on an
object by object basis. We link the quality factor to
linear power spectrum biases including a multiplicative
bias $\mathcal{M} \simeq 2m$ and an additive bias $\mathcal{A} \propto \sigma(c)^2$ that are
approximately related to the STEP one-point estimators
of shape measurement bias. The equality is only
approximate because in general $\mathcal{M}$ and $\mathcal{A}$ are a measure
of spatially varying method biases. We introduce
further metrics that allow an assessment of the contribution
to the multiplicative and additive biases from
correlations between the biases and any spatially variable
quantity (in this paper we focus on PSF size and
ellipticity).
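
The one-point m and c parameters can be illustrated with a simple regression of measured against true shear. The following Python sketch assumes the linear model g_measured = (1 + m) g_true + c for a single shear component and an ordinary least-squares fit; it is not the estimator of Appendix B, and the function name, noise level and input biases are hypothetical.

```python
import numpy as np

def fit_m_c(g_true, g_measured):
    """Least-squares fit of g_measured = (1 + m) * g_true + c for one component."""
    slope, intercept = np.polyfit(g_true, g_measured, 1)
    return slope - 1.0, intercept  # (m, c)

# Toy catalogue; all numbers are illustrative assumptions.
rng = np.random.default_rng(0)
n_gal = 1_000_000
g_true = rng.uniform(-0.05, 0.05, size=n_gal)    # input shear, one component
shape_noise = rng.normal(0.0, 0.3, size=n_gal)   # intrinsic ellipticity scatter
g_meas = (1.0 + 0.01) * g_true + 2e-4 + shape_noise

m_hat, c_hat = fit_m_c(g_true, g_meas)
# The scatter on m_hat is about sigma_e / (sqrt(N) * rms(g_true)), i.e. ~0.01 here,
# so even a million noisy galaxies only constrain m at the per cent level.
```

This illustrates why large simulations are needed to measure sub-percent multiplicative biases when intrinsic ellipticity noise is present.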

The simulations were divided into sets of 200 images,
each containing a grid of 10,000 galaxies. In each
set the shear field was spatially varying but constant
between images. The challenge was to reconstruct the
shear power spectrum for each set. Participants could
submit either catalogues of ellipticities, one per image,
or power spectra, one for each set, and were provided
with an exact functional description of the PSF and
the positions of all objects to within half a pixel.

The simulations were structured in such a way 
that conclusions could be made about a shape mea- 
surement method's accuracy as a function of galaxy 
signal-to-noise, galaxy size, galaxy model/type and 
the PSF type. The simulations also contained some 
'multi-epoch' sets in which the shear and intrinsic el- 
lipticities were fixed between images in a set but where 
the PSF varied between images, and some 'static 
single-epoch' sets where the PSF was fixed between 
images in a set but the intrinsic ellipticity field var- 
ied between images. All fields were always spatially 
varying. Participants were provided with true shears 



for one of the high signal-to-noise sets that they could 
use as a training set. 

Despite the simplicity of the challenge, making
conclusions about which aspects of which algorithm
generated accurate shape measurement is difficult due
to the complexity of the algorithms themselves (see
Appendix E). We leave investigations into tunable aspects
of each method to future work. We can however
make some statements about the regimes in which
methods perform well or poorly.

The best methods submitted to GREAT10 scored
an average Q ~ 300 with $m \simeq 7 \times 10^{-3}$ and $c \simeq 10^{-5}$;
this is approximately a factor 3 improvement over previous
performance on blind simulations (the best performing
non-stacking method at a signal-to-noise 20,
using the GREAT10 definition, in GREAT08 was CH,
which had an m = 0.0095 ± 0.003 and c ~ 8 × 10^[exponent illegible]).
The methods that won the challenge (scoring the highest
Q on the leaderboard) employed a maximum likelihood
model-fitting method. Several methods used the
training data to test code, and we find that by directly
training on a high signal-to-noise set the majority of
methods achieve a factor of 2 increase in the average
value of Q. We find some evidence that shape measurement
inaccuracies can be reduced by averaging methods
together, but conclude that for such a method to
be usable an optimal weight for each method as a function
of signal-to-noise and galaxy properties would
have to be found.

For a signal-to-noise of 40 the best methods
achieved a Q > 1000, m < 1 × 10^[exponent illegible] and c < 1 × 10^[exponent illegible]; the
majority of methods have an accuracy that is strongly
dependent on signal-to-noise, with Q ~ 100 and ~ 50
for signal-to-noise of 20 and 10 respectively. However
the dependence on galaxy model (bulge-to-disk ratio
or bulge-to-disk offset) and size is not strong. There is
a contribution to the multiplicative bias m from PSF
size correlations for the majority of methods over all
sets, but a smaller contribution from PSF ellipticity
dependence (as expected from theoretical calculations,
e.g. Massey et al., in prep).

The testing of shape measurement methods by
GREAT10 suggests methods now exist that can be
used for cosmic shear surveys covering up to a few
thousand square degrees (<3000 square degrees, which
require m < 6 × 10^[exponent illegible]; Kitching et al., 2008^) to measure
cosmological parameters in an unbiased fashion.
We find that on the additive bias c methods already
meet requirements for even the most ambitious surveys
(c < 1 × 10^[exponent illegible]) over all simulated conditions, and
that in the high signal-to-noise regime (> 40) methods
already meet the most ambitious requirements on
the multiplicative bias (m < 2 × 10^[exponent illegible]; Kitching et al.,
2008), suggesting that such accuracy is possible for,
or can at least be extrapolated down to, lower signal-to-noise,
in principle. However we note that the requirements
are on all galaxies in a survey and that
the demonstration here is averaged over a simulation
with particular properties, in particular the fiducial
^ The scaling formula from this paper can be rewritten for
the maximum applicable area of a survey for a given bias
m as $A_{\rm max} < 20{,}000\,[(0.001/m)^{2.4}/0.17/10^{\beta}]^{1/1.5}$ square
degrees, assuming that the redshift behaviour is m ∝ (1 +
signal-to-noise is 20. Therefore these conclusions have
a caveat that the GREAT10 simulations were intentionally
simplistic in some respects, so that clear statements
about methods could be made, but they provide
a foundation for shape measurement development to
continue increasing in realism and complexity.
APPENDIX A: PSEUDO-$C_\ell$ ESTIMATORS FOR WEAK LENSING

In this Section we describe a general formalism for the evaluation of variable shear systematics in weak lensing.
We note that this has a more general application than that described here, such that any mask in general could be
accounted for in weak lensing power spectrum estimation. This closely follows the pseudo-$C_\ell$ formalism described
in Memari (2010) and Brown et al. (2005), which has been applied in CMB studies for survey masks.
We start by defining a generalised shear systematic response where

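$$e_{\rm measure}(\theta) = \gamma(\theta) + e_{\rm intrinsic}(\theta) + c(\theta) + m(\theta)\,[\gamma(\theta) + e_{\rm intrinsic}(\theta)] + q(\theta)\,[\gamma(\theta) + e_{\rm intrinsic}(\theta)]\,|\gamma(\theta) + e_{\rm intrinsic}(\theta)| \qquad (11)$$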

where all variables are a function of position on the sky, and all are complex quantities (e.g. $\gamma(\theta) = \gamma_1(\theta) + i\gamma_2(\theta)$).
We expect that $m(\theta)$ will in general depend on spatially varying quantities including PSF ellipticity
and size or galaxy properties such as signal-to-noise, so that one could write $m(\theta) \rightarrow m({\rm PSF}(\theta), {\rm Galaxy}(\theta))$ or
$m(e_{\rm PSF}(\theta), r_{\rm PSF}(\theta), {\rm S/N}(\theta), \ldots)$ for example, but this does not qualitatively change the following treatment. We
note also that in general the systematic terms can also be complex, $m(\theta) = |m(\theta)|e^{i\phi}$; here we assume a scalar
spatially varying quantity, and will investigate further generalisation in future work.

The E and B mode decomposition of the spin-2 field $e_{\rm measure}(\theta)$ can be written in general as a rotation in
Fourier space (see GREAT10 handbook) such that

$$E(\ell) \pm iB(\ell) = e^{\mp 2i\phi_\ell}\,[e_{1,{\rm measure}}(\ell) \pm i e_{2,{\rm measure}}(\ell)] \qquad (12)$$

where $e_{\rm measure}(\ell)$ is the Fourier transform of $e_{\rm measure}(\theta)$.
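
The rotation of equation (12) can be illustrated on a gridded field with a discrete Fourier transform. This Python sketch assumes a periodic, regularly sampled map, NumPy's FFT conventions, and the sign convention E + iB = exp(-2i phi)(e1 + i e2), where phi is the polar angle of each Fourier mode; other references may use the opposite sign. The function names and the toy input are hypothetical.

```python
import numpy as np

def mode_angles(shape):
    """Polar angle phi of every Fourier mode on an (n_y, n_x) grid."""
    n_y, n_x = shape
    lx = np.fft.fftfreq(n_x)[None, :]
    ly = np.fft.fftfreq(n_y)[:, None]
    return np.arctan2(ly, lx)  # broadcasts to shape (n_y, n_x)

def eb_decompose(e1, e2):
    """Return the Fourier-space E and B modes of the spin-2 field e1 + i*e2."""
    phi = mode_angles(e1.shape)
    e1_f, e2_f = np.fft.fft2(e1), np.fft.fft2(e2)
    e_mode = np.cos(2 * phi) * e1_f + np.sin(2 * phi) * e2_f
    b_mode = -np.sin(2 * phi) * e1_f + np.cos(2 * phi) * e2_f
    return e_mode, b_mode

def pure_e_field(kappa):
    """Build a B-mode-free ellipticity field from a scalar map (toy input)."""
    phi = mode_angles(kappa.shape)
    kappa_f = np.fft.fft2(kappa)
    e1 = np.fft.ifft2(np.cos(2 * phi) * kappa_f).real
    e2 = np.fft.ifft2(np.sin(2 * phi) * kappa_f).real
    return e1, e2

rng = np.random.default_rng(1)
kappa = rng.normal(size=(65, 65))  # odd grid size avoids unpaired Nyquist modes
e1, e2 = pure_e_field(kappa)
e_mode, b_mode = eb_decompose(e1, e2)  # E recovers kappa, B is consistent with zero
```

The odd grid size is a convenience for the toy check: on an even grid the Nyquist modes have no distinct conjugate partner, so the real-space maps built here would lose a small part of the signal.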

When creating a power spectrum the auto-correlations of the first three terms of equation (11) have a simple interpretation, but the fourth term has an effective weight map as a function of position such that (focusing only on the contribution from the fourth term) the estimated E and B-mode terms are

$$\tilde{E}(\ell) \pm i\tilde{B}(\ell) = \int \frac{d^2\ell'}{(2\pi)^2}\, e^{\pm 2i(\phi_{\ell'} - \phi_\ell)}\, W_m(\ell - \ell')\,[E(\ell') \pm iB(\ell')], \qquad (13)$$

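where $W_m$ is the 2D Fourier transform of the $m(\theta)$ field. Equivalently for the E-mode part only we have

$$\tilde{E}(\ell) = \int \frac{d^2\ell'}{(2\pi)^2}\, W_m(\ell - \ell')\,[\cos(2(\phi_{\ell'} - \phi_\ell))E(\ell') - \sin(2(\phi_{\ell'} - \phi_\ell))B(\ell')], \qquad (14)$$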

where this equation has the interpretation of a rotation of E and B to ellipticity in Fourier space, a convolution with the window/weight function, and then a rotation back to E and B. We now wish to compute the effect that the weight map has on the E-mode power. In Fourier space the auto and cross power are defined as

$$\langle X_i(\ell) X_j^{*}(\ell')\rangle = (2\pi)^2\, C^{ij}_\ell\, \delta^D(\ell - \ell') \qquad (15)$$

where isotropy of the field is assumed. This means that an unbiased estimator can be written in the flat sky limit as an average over angle in $\ell$-space

$$\langle C^{ij}_\ell\rangle = \int \frac{d\phi_\ell}{(2\pi)}\, \langle X_i(\ell) X_j^{*}(\ell)\rangle. \qquad (16)$$

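Hence by taking the correlation function of equation (14) we can calculate the estimated power spectrum in the presence of a systematic weight map. This follows the calculations of Memari (2010); the resulting expressions for the EE power and BB power are below, and we include the EB expression for completeness (however in the flat sky limit there is no mixing of EB with EE or BB; there is mixing between EE and BB though)

$$\langle\tilde{C}^{EE}_\ell\rangle = \int \frac{\ell' d\ell'}{(2\pi)^2} \left\{ \int dL\, L\, \frac{W_{mm}(L)}{\ell\ell'\sin\eta} \left([1 + \cos 4\eta]\langle C^{EE}_{\ell'}\rangle + [1 - \cos 4\eta]\langle C^{BB}_{\ell'}\rangle\right) \right\}$$
$$\langle\tilde{C}^{EB}_\ell\rangle = \int \frac{\ell' d\ell'}{(2\pi)^2} \left\{ \int dL\, L\, \frac{W_{mm}(L)}{\ell\ell'\sin\eta}\, 2\cos 4\eta\, \langle C^{EB}_{\ell'}\rangle \right\}$$
$$\langle\tilde{C}^{BB}_\ell\rangle = \int \frac{\ell' d\ell'}{(2\pi)^2} \left\{ \int dL\, L\, \frac{W_{mm}(L)}{\ell\ell'\sin\eta} \left([1 - \cos 4\eta]\langle C^{EE}_{\ell'}\rangle + [1 + \cos 4\eta]\langle C^{BB}_{\ell'}\rangle\right) \right\}, \qquad (17)$$

where the additional $L$-mode forms a triangle with $\ell$ and $\ell'$ ($|\ell - \ell'| < L < \ell + \ell'$), with $\cos\eta = (\ell^2 + \ell'^2 - L^2)/2\ell\ell'$ and similarly for $\sin\eta$, and $W_{mm}$ is the angle-average of the modulus squared of the weight function

$$W_{mm}(L) = \int \frac{d\phi_L}{(2\pi)}\, |W_m(L)|^2. \qquad (18)$$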

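In the discrete case we can write equations (17) in a compact form using mixing matrices such that

$$\begin{pmatrix} \langle\tilde{C}^{EE}_\ell\rangle \\ \langle\tilde{C}^{BB}_\ell\rangle \end{pmatrix} = \sum_{\ell'} \begin{pmatrix} M^{EE,mm}_{\ell\ell'} & M^{BB,mm}_{\ell\ell'} \\ M^{BB,mm}_{\ell\ell'} & M^{EE,mm}_{\ell\ell'} \end{pmatrix} \begin{pmatrix} \langle C^{EE}_{\ell'}\rangle \\ \langle C^{BB}_{\ell'}\rangle \end{pmatrix} \qquad (19)$$

where

$$M^{EE,mm}_{\ell\ell'} = \frac{\ell'\Delta\ell'}{(2\pi)^2} \sum_L \Delta L\, L\, W_{mm}(L)\, \frac{1 + \cos 4\eta}{\ell\ell'\sin\eta}$$
$$M^{BB,mm}_{\ell\ell'} = \frac{\ell'\Delta\ell'}{(2\pi)^2} \sum_L \Delta L\, L\, W_{mm}(L)\, \frac{1 - \cos 4\eta}{\ell\ell'\sin\eta}, \qquad (20)$$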

and similarly for the EB power. These expressions assume that the systematic fields are uncorrelated with the shear 
and intrinsic ellipticity fields. This may not be the case in real data (e.g. selection effects over galaxy populations 
may have particular biases), but for GREAT10 selection effects are not investigated and the biases are quoted as
averages over populations. We leave a generalisation of this formalism to correlated systematic-ellipticity fields 
for future work. 

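Using this we can write a power spectrum estimate of the quantities in equation (11) (we drop the angle brackets over $C_\ell$ for clarity from here) including the $\gamma I$ cross term

$$\tilde{C}^{EE}_\ell = (1 + 2m_\ell)\,[C^{EE,\gamma\gamma}_\ell + C^{EE,II}_\ell + C^{EE,\gamma I}_\ell] + \mathcal{A}_\ell + \sum_{\ell'}\left(M^{EE,mm}_{\ell\ell'}\,[C^{EE,\gamma\gamma}_{\ell'} + C^{EE,II}_{\ell'} + C^{EE,\gamma I}_{\ell'}] + M^{BB,mm}_{\ell\ell'}\,[C^{BB,\gamma\gamma}_{\ell'} + C^{BB,II}_{\ell'} + C^{BB,\gamma I}_{\ell'}]\right) \qquad (21)$$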

where $\mathcal{A}_\ell$ is the angle averaged power spectrum of the $c(\theta)$ variation; here, through isotropy, it is assumed that the power contains all relevant information. This could be generalised to include non-isotropic variation in all terms, i.e. not taking the angle averages. $m_\ell$ is the angle averaged Fourier transform of $m(\theta)$. Our notation, for example $C^{EE,AB}_\ell$, refers to the EE power corresponding to correlations between quantities $A$ and $B$ as a function of $\ell$. We do not include terms from the quadratic $q(\theta)$ contribution. For GREAT10 the $\gamma$ field is E-mode only and the intrinsic ellipticity field is B-mode only, with no $\gamma I$ term, so we have a simpler expression

$$\tilde{C}^{EE}_\ell = (1 + 2m_\ell)\,C^{EE,\gamma\gamma}_\ell + \mathcal{A}_\ell + \sum_{\ell'}\left(M^{EE,mm}_{\ell\ell'}\,C^{EE,\gamma\gamma}_{\ell'} + M^{BB,mm}_{\ell\ell'}\,C^{BB,II}_{\ell'}\right). \qquad (22)$$

These expressions are general for any shape measurement biases, and are trivially extendable to include cross-terms that may appear in real data (e.g. $\langle cm\rangle$ cross terms) if required.

Equation (22) represents the general way in which shape measurement inaccuracies in GREAT10 propagate through to the shear power spectrum. In the case that the weight map is constant ($m(\theta) = {\rm constant} = m_0$) (i.e. non-isotropic) the Fourier transform becomes a delta function and the mixing matrices become $M^{EE,mm}_{\ell\ell'} = I_{\ell\ell'} \times m_0^2 = M(\ell)$ and $M^{BB,mm}_{\ell\ell'} = 0$. This leads to

$$\tilde{C}^{EE}_\ell = C^{EE,\gamma\gamma}_\ell + \mathcal{A} + \mathcal{M}\,C^{EE,\gamma\gamma}_\ell \qquad (23)$$

where $\mathcal{M} = 2m_0 + m_0^2$ and where we take a mean value of $\mathcal{A} = (2\pi/\ell_{\rm max})\,\sigma(c)^2$ over $\ell$. In general the mixing matrices do not depend on a single $\ell$ only (i.e. diagonal $M_{\ell\ell}$), except in the case that the systematic is isotropic or constant. Unfortunately this is unlikely to be the case in weak lensing where, for example, PSF ellipticity and size are often coherent but not constant across a field of view. Massey et al. (in prep) will discuss requirements on these parameters $\mathcal{M}$ and $\mathcal{A}$, and how they relate to uncertainty in PSF parameters.

We note that this formalism means that we only need to recover the statistical properties of the varying $m(\theta)$ field (the power spectrum and mixing matrix) in order to propagate its impact through to the shear power spectrum. In addition, as shown in Appendix B, this formalism can also be used to generate expressions for correlation coefficients between the systematic $m(\theta)$ and $c(\theta)$ fields and any spatially varying quantity. Given these definitions and formalism we can now proceed to outline the metrics used in this paper, taking into account some practicalities such as pixel noise removal.



APPENDIX B: DESCRIPTION OF THE EVALUATION METRICS 

The variable shear nature of the simulations enables a variety of metrics to be calculated, each of which allows us to infer different properties of the shape measurement method under scrutiny. In this paper we define a variety of metrics that we explain in detail in this Section.



B1. Quality factor

In general for a variable field we define the power spectrum as the Fourier transform of the correlation function as described in Appendix A. We wish to compare the power reconstructed from the submissions against the true shear power spectrum and so define a baseline evaluation metric, the quality factor ($Q$), as

$$Q = 1000\, \frac{5\times 10^{-6}}{\int d\ln\ell\, |\tilde{C}^{EE}_\ell - C^{EE,\gamma\gamma}_\ell|\, \ell^2}. \qquad (24)$$

The numerator $5\times 10^{-6}$ is calculated by generating Monte Carlo realisations of mock submitted power spectra and calculating the bias in the dark energy equation of state parameter $w_0$ (Linder, 2003) which would occur if such an observation were made (using the functional form filling formalism described in Kitching et al., 2008) over a survey of 20,000 square degrees using the same redshift distribution as described in Section 3.2. In Figure 11 we show the result of this procedure for GREAT10 (where the numerator in equation 24 is labelled as $\sigma_{\rm sys}$), where we take a threshold value of bias-to-error ratio of 1. This is in fact conservative as shown in Massey et al. (2012, in prep). The factor of 1000 normalises the metric such that a good method should achieve $Q \simeq 1000$. A factor $(1/2\pi)$ could be included in the denominator, but we absorb this into the factor $5\times 10^{-6}$. This was the quality factor used in the online leaderboard during the challenge.

Figure 11. Monte Carlo realisations of submitted shear power spectrum where $\sigma_{\rm sys}$ is the denominator in equation (24), and the calculated bias in dark energy parameter with respect to its error.
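The quality factor of equation (24) can be sketched numerically as follows. This is a minimal illustration rather than the challenge code: the function name `quality_factor`, the trapezoidal integration in $\ln\ell$, and the optional `noise_term` argument (which subtracts a constant noise power before the comparison) are all assumptions made for this example.

```python
import numpy as np

def quality_factor(ell, c_est, c_true, noise_term=0.0, numerator=5e-6):
    """Q = 1000 * numerator / int dln(ell) |C_est - noise - C_true| ell^2.

    ell, c_est and c_true are 1D arrays sampled at the same multipoles.
    noise_term is a constant (white) noise power removed from c_est;
    leaving it at zero gives the uncorrected quality factor.
    """
    ell = np.asarray(ell, dtype=float)
    diff = np.asarray(c_est, dtype=float) - noise_term - np.asarray(c_true, dtype=float)
    integrand = np.abs(diff) * ell**2
    # Trapezoidal rule in ln(ell) (an assumed discretisation).
    dln = np.diff(np.log(ell))
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * dln)
    return 1000.0 * numerator / integral

# Toy usage: a residual proportional to 1/ell^2 gives a constant integrand.
ell = np.logspace(1.0, 3.0, 50)
c_true = 1e-9 * (ell / 100.0) ** -1.0          # hypothetical true power
k = 5e-6 / np.log(ell[-1] / ell[0])            # chosen so that Q = 1000
c_est = c_true + k / ell**2
q_value = quality_factor(ell, c_est, c_true)
```

In this toy case the integral equals the numerator by construction, so the method sits exactly at the $Q = 1000$ threshold.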



B2. Pixel noise corrected quality factor 

In general we can express the measured total ellipticity by including a noise term $e_n$ in equation (11), where $e_n$ is some inaccuracy in this estimator due to stochastic terms in the shape measurement method, or due to pixel noise in the images (finite signal-to-noise). In the simulations, for ellipticity catalogue submissions, we averaged over $N_{\rm realisation}$ realisations of the noise. In this averaging the mean of the noise contribution is assumed to be zero, $\langle e_n\rangle = 0$ over realisations, but an error on this mean remains. By propagating this through to the power spectrum we recover

$$\tilde{C}^{EE}_\ell \rightarrow \tilde{C}^{EE}_\ell + \frac{\sigma_n^2}{N_{\rm realisation} N_{\rm object}} \qquad (25)$$

where the noise term is white noise (constant over all scales) with a variance $\sigma_n^2$, which is a sum of the $e_1$ and $e_2$ components. The noise term is now averaged over the number of realisations and the number of objects. For values of $N_{\rm realisation} = 200$ and $N_{\rm object} = 10^4$ the expected fractional contribution to the measured power is $\sigma_n^2/(N_{\rm realisation} N_{\rm object}\, C_{\ell,{\rm estimated}}) \approx (\sigma_n/0.05)^2$.

The measured power spectra inferred from the ellipticity catalogue submissions and used in the quality factor ($Q$) defined in equation (24) therefore include this noise term. However, for an error induced by noise on ellipticity estimates of $\sigma_n < 0.05$ the impact on the metric should be subdominant. It is commonly assumed that such noise terms could be removed in real data (this is trivial for correlation functions, but is more complex for power spectrum estimates, which require an estimate of $\sigma_n$ from data, i.e. the full covariance of the shear estimators; see also e.g. Schneider et al. 2010), and some power spectrum submissions (see Section 6) did employ techniques to remove this term from the submitted power spectrum. Hence we here introduce a quality factor that accounts for this noise term

$$Q_{\rm dn} = 1000\, \frac{5\times 10^{-6}}{\int d\ln\ell\, \left|\tilde{C}^{EE}_\ell - \frac{\langle\sigma_n^2\rangle}{N_{\rm realisation} N_{\rm object}} - C^{EE,\gamma\gamma}_\ell\right| \ell^2} \qquad (26)$$

where $\langle\sigma_n^2\rangle$ is an estimated value of the pixel noise term from the ellipticity catalogue submissions.

To estimate the value of $\langle\sigma_n^2\rangle$ from the simulations we have to separate the E-mode shear field from the B-mode only intrinsic ellipticity field, otherwise the variance of the ellipticities from a submitted entry will be dominated by the variance of the intrinsic ellipticities. This is done using the rotations described in Appendix A; here we describe this pedagogically (we also use explicit Cartesian coordinates $\theta = (x, y)$ and $\ell = (\ell_x, \ell_y)$ for clarity). We make a 2D discrete Fourier transform of the submitted ellipticity values such that

$$e_{\rm measure}(\ell_x, \ell_y) = {\rm FT}[e_{\rm measure}(x, y)] \qquad (27)$$

where here the measured ellipticity is averaged over all noise realisations before transformation. We then rotate this field such that

$$e_{\rm rot,measure}(\ell_x, \ell_y) = (\ell^{*}\ell^{*}/|\ell|^2)\, e_{\rm measure}(\ell_x, \ell_y) \qquad (28)$$

and then inverse Fourier transform to real space

$$\kappa(x, y) + i\beta(x, y) = {\rm iFT}[e_{\rm rot,measure}(\ell_x, \ell_y)] \qquad (29)$$
where we now have a $\kappa(x, y)$ field which contains E-mode power only and a $\beta(x, y)$ field that contains B-mode power only. The simulations have been set up such that the intrinsic ellipticity field has B-mode power only, so we can now take the $\kappa(x, y)$ map and generate an E-mode only ellipticity catalogue that should contain only the estimated shear values and the noise term

$$\kappa(x, y) \rightarrow e_{E,{\rm measure}}(x, y) \simeq \gamma(x, y) + e_n(x, y), \qquad (30)$$

where $\gamma$ is the estimated shear for each position (object) in the field. We do this by following the inverse steps of the transformations from equations (27) to (29), and assume noise is equally distributed between E and B modes. The expression is only approximate because of position dependent biases (see Appendix A and next Section), which can mix E and B modes, but for the majority of methods presented in this paper this effect seems to be subdominant. By taking the variance of $e_{E,{\rm measure}}(x, y)$ we find that

$$\sigma^2_{E,{\rm measure}} = \sigma^2_\gamma + \sigma^2_n \qquad (31)$$

and so our estimate of the noise variance is

$$\langle\sigma_n^2\rangle \simeq \sigma^2_{E,{\rm measure}} - \sigma^2_\gamma. \qquad (32)$$

To calculate this we use the true shear values to find $\sigma^2_\gamma$ (but note that the true individual shear values are not used directly).

To test that such a correction works we simulated a submission by taking the true shear values and adding random normally distributed numbers to each of the 10,000 x 200 x 24 shear values. We show results in Figure 12. We find as expected that as the noise increases the value of $Q$ (equation 24) decreases, but that including the noise correction (equation 26) increases the value. Note that due to the finite size of the simulations any estimation of $\sigma_n$ is itself noisy, which means the corrected value of $Q_{\rm dn} < \infty$ even in this ideal case.

Figure 12. A simulation of the effect on $Q$ (black line) and $Q_{\rm dn}$ (green line) as the noise in a mock submission (containing only noise and the true shear values) increases. Lines at $Q = 1000$ and $\sigma_n = 0.1$ are to guide the reader.
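The Fourier-space rotation used above to separate E and B modes can be illustrated on a periodic square grid. The sketch below is an assumption-laden toy, not the analysis code: the helper names `rotation`, `split_e_b` and `e_mode_catalogue` are hypothetical, the grid is treated as periodic, and the zero mode is left unrotated by convention.

```python
import numpy as np

def rotation(n):
    """Return l* l* / |l|^2 on an n x n FFT grid (set to 1 at l = 0)."""
    lx = np.fft.fftfreq(n)[np.newaxis, :]
    ly = np.fft.fftfreq(n)[:, np.newaxis]
    l2 = lx**2 + ly**2
    l2[0, 0] = 1.0                       # avoid 0/0 at the zero mode
    rot = (lx - 1j * ly) ** 2 / l2
    rot[0, 0] = 1.0                      # leave the mean unrotated
    return rot

def split_e_b(e1, e2):
    """Rotate a gridded ellipticity field into E-mode and B-mode maps."""
    rot = rotation(e1.shape[0])
    e_rot = rot * np.fft.fft2(e1 + 1j * e2)    # forward transform and rotate
    kappa_beta = np.fft.ifft2(e_rot)           # back to real space
    return kappa_beta.real, kappa_beta.imag

def e_mode_catalogue(kappa):
    """Inverse steps: rebuild a complex ellipticity field from the E map only."""
    rot = rotation(kappa.shape[0])
    return np.fft.ifft2(np.conj(rot) * np.fft.fft2(kappa))

# Toy usage: an E-mode "shear" map plus a much larger B-mode "intrinsic" map.
rng = np.random.default_rng(0)
n = 32
kappa_true = rng.normal(0.0, 0.03, (n, n))
beta_true = rng.normal(0.0, 0.3, (n, n))
e = np.fft.ifft2(np.conj(rotation(n)) * np.fft.fft2(kappa_true + 1j * beta_true))
kappa, beta = split_e_b(e.real, e.imag)
e_E = e_mode_catalogue(kappa)
var_E = np.mean(np.abs(e_E) ** 2)        # free of the B-mode contribution
```

Because the rotation has unit modulus, the variance of the E-mode only catalogue equals that of the E map, which is why the B-mode intrinsic ellipticity no longer dominates the variance estimate.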



B3. One-point estimator shear relations 

As well as metrics that integrate over the measured power spectra we can also investigate a number of metrics that encapsulate a relation between the measured and true shears for individual objects. This ties the quality factor metrics to the STEP (Heymans et al., 2006) $m$ and $c$ values where

$$\hat\gamma_i = (1 + m_{ij})\gamma^t_j + c_i \qquad (33)$$

where $\gamma^t$ is the true shear and $\hat\gamma$ is the measured shear for each component; this is a simplification of equation (11), and is the relation used for all constant shear simulations (with no position dependence). We also add a quadratic non-linear term to this relation ($q_{ij}\gamma^t_j|\gamma^t_j|$)

$$\hat\gamma_i = (1 + m_{ij})\gamma^t_j + c_i + q_{ij}\gamma^t_j|\gamma^t_j| \qquad (34)$$

that contains $\gamma|\gamma|$, not $\gamma^2$, since we may expect divergent behaviour to more positive and more negative shear values for each domain respectively. In general $m_{ij}$ and $q_{ij}$ could be non-diagonal matrices, however in this paper we assume that they are diagonal and take an average over the two shear components to give

$$\hat\gamma = (1 + m)\gamma^t + c + q\gamma^t|\gamma^t| \qquad (35)$$

where all quantities are averaged over $\gamma_1$ and $\gamma_2$.

In a variable shear simulation calculating $m$, $c$ and $q$ by regressing $e_{\rm measure}$ and $(\gamma + e_{\rm intrinsic})$ would result in a noisy estimator dominated by intrinsic ellipticity noise. However we can calculate $m$, $c$ and $q$ directly by finding the estimated shear for each galaxy individually, removing the intrinsic ellipticity contribution (equation 30). This is, for every galaxy, a noisy estimate of the shear; we then average these estimates over bins in $\gamma^t$. This enables the $m$, $c$ and $q$ parameters to be recovered, and in fact the variable field simulations allow for flexible binning as a function of any other spatially varying quantity (see Appendix E), and an exact removal of shape noise (through the B-mode intrinsic power). This method of calculating the $m$, $c$ and $q$ parameters is a one-point estimate of the shape measurement biases and makes no assumption about spatially correlated effects.

Figure 13. An exploration of the $(\mathcal{M}, \mathcal{A})$, $(m, c)$ and $(m, \sigma(c))$ planes, where at each point the quality factor is calculated using a noise free fiducial power spectrum. The colour scale shows the logarithm of the quality factor, $\log(Q_{\rm dn})$. This can be compared to Figure 2.
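The binned one-point fit described above can be sketched as follows. This is an illustrative implementation under stated assumptions: the function name `fit_m_c_q` is hypothetical, the bins are equal-width in true shear, the fit is unweighted, and $q$ is fitted with $m$ and $c$ held at their best-fit values, following the description of equation (35).

```python
import numpy as np

def fit_m_c_q(gamma_true, gamma_est, n_bins=10):
    """Fit gamma_est - gamma_true = m*g + c (+ q*g*|g|) on bin averages."""
    edges = np.linspace(gamma_true.min(), gamma_true.max(), n_bins + 1)
    idx = np.clip(np.digitize(gamma_true, edges) - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    keep = counts > 0
    # Average the true shear and the residual in each bin.
    g = np.bincount(idx, weights=gamma_true, minlength=n_bins)[keep] / counts[keep]
    d = np.bincount(idx, weights=gamma_est - gamma_true, minlength=n_bins)[keep] / counts[keep]
    m, c = np.polyfit(g, d, 1)            # gradient and offset
    # Non-linear term with m and c fixed at their best-fit values.
    basis = g * np.abs(g)
    q = np.sum((d - (m * g + c)) * basis) / np.sum(basis**2)
    return m, c, q

# Toy usage with hypothetical biases m = 0.02, c = 0.001 and no noise.
rng = np.random.default_rng(2)
gamma_t = rng.uniform(-0.05, 0.05, 10000)
gamma_hat = (1.0 + 0.02) * gamma_t + 0.001
m_fit, c_fit, q_fit = fit_m_c_q(gamma_t, gamma_hat)
```

With real submissions each per-galaxy estimate is noisy, so the bin averages carry errors that a weighted fit would take into account; that refinement is omitted here.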



B4. Power spectrum relations 

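As described in Appendix A we can write an expression for the estimated power using two linear parameters $\mathcal{M}$ and $\mathcal{A}$; taking into account the pixel noise removal we have a similar expression

$$\tilde{C}^{EE}_\ell - \frac{\langle\sigma_n^2\rangle}{N_{\rm realisation} N_{\rm object}} = (1 + \mathcal{M})\,C^{EE,\gamma\gamma}_\ell + \mathcal{A}. \qquad (36)$$

This can be related to the $m$ and $c$ parameters

$$\mathcal{M} \simeq m^2 + 2m \approx 2m$$
$$\mathcal{A} \simeq \sigma(c)^2 \qquad (37)$$

where $\sigma(c)^2$ is the variance of the $c$ parameter, but only approximately because of the assumption of some form of spatial variation (constant in this case).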
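As a rough numerical illustration of equations (36) and (37), the linear power parameters can be recovered by a straight-line fit of the noise-corrected estimated power against the true power. The function names below are hypothetical, the unweighted fit is an assumption, and the form $(1 + \mathcal{M})C + \mathcal{A}$ is taken to match equation (23).

```python
import numpy as np

def power_params_from_point_biases(m, sigma_c):
    """Approximate (M, A) from constant m and the scatter of c (equation 37)."""
    return m**2 + 2.0 * m, sigma_c**2

def fit_linear_power_bias(c_true, c_est_denoised):
    """Fit C_est = (1 + M) C_true + A and return (M, A)."""
    slope, intercept = np.polyfit(c_true, c_est_denoised, 1)
    return slope - 1.0, intercept

# Toy usage with hypothetical values m = 0.01 and sigma(c) = 1e-6.
ell = np.logspace(1.0, 3.0, 40)
c_true = 1e-9 * (ell / 100.0) ** -1.0
M_in, A_in = power_params_from_point_biases(0.01, 1e-6)
c_est = (1.0 + M_in) * c_true + A_in
M_fit, A_fit = fit_linear_power_bias(c_true, c_est)
```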

In Figure 13 we show how $Q_{\rm dn}$, $\mathcal{M}$, $\mathcal{A}$ and the point estimators $m$ and $c$ are related. To create this we explore the $(\mathcal{M}, \mathcal{A})$ plane and, using the fiducial power spectrum, calculate $Q_{\rm dn}$ for each value. We also show a realisation where random components have been added, $\mathcal{M}(1 + R)$ where $R$ is a uniform random number and similarly for $\mathcal{A}$, at each point in parameter space to simulate a more realistic submission. We find that there is a degenerate line in $Q_{\rm dn}$ where an offset $\mathcal{A}$ can be partially cancelled by a negative $\mathcal{M}$ yielding the same $Q_{\rm dn}$, and a more straightforward relation for $\mathcal{M} > 0$. As expected the $c$ parameter does not impact the quality factor but the variance of $c$ does. There is a similar degeneracy between $m$, $\sigma(c)$ and $Q_{\rm dn}$ as with the linear power spectrum parameters; this is as expected from equation (37), except that for large negative $m$ the quadratic term begins to become important.



B5. Correlations with spatially varying quantities

To relax the assumption of constant $m$ and $c$ in power spectrum analysis we can assume that each of these is correlated with some spatially varying parameter $X(\theta)$

$$m(\theta) = m_0 + \alpha X(\theta)$$
$$c(\theta) = c_0 + \beta X(\theta) \qquad (38)$$

with correlation coefficients $\alpha$ and $\beta$. This is a simple relation and could be made significantly more complex.

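We explain in a correlation function notation how these propagate through, for pedagogical purposes, but for the full treatment one should refer to the pseudo-$C_\ell$ methodology that we present in Appendix A. A simple correlation function approximation of the measured shear can be written

$$\langle(e_{\rm measure})_n(e_{\rm measure})_n^{*}\rangle = \alpha^2\langle XX^{*}\rangle[\langle\gamma\gamma^{*}\rangle + \langle(e_{\rm intrinsic})_n(e_{\rm intrinsic})_n^{*}\rangle] + (2(1 + m_0)\alpha\langle X\rangle + (1 + m_0)^2)[\langle\gamma\gamma^{*}\rangle + \langle(e_{\rm intrinsic})_n(e_{\rm intrinsic})_n^{*}\rangle] + \beta^2\langle XX^{*}\rangle \qquad (39)$$

not including the pixel noise term. We can also take the cross correlation between the measured ellipticity and these quantities

$$\langle(e_{\rm measure})_n X^{*}\rangle = \langle((1 + m_0 + \alpha X)(\gamma + (e_{\rm intrinsic})_n) + c_0 + \beta X)X^{*}\rangle$$
$$= (1 + m_0)\langle(\gamma + (e_{\rm intrinsic})_n)X^{*}\rangle + \alpha\langle X(\gamma + (e_{\rm intrinsic})_n)X^{*}\rangle + \beta\langle XX^{*}\rangle + c_0\langle X^{*}\rangle$$
$$\approx (1 + m_0)\langle(\gamma + (e_{\rm intrinsic})_n)X^{*}\rangle + \beta\langle XX^{*}\rangle + c_0\langle X^{*}\rangle \qquad (40)$$

which results in an expression that is not dependent on $\alpha$, assuming that third order correlations and noise-$X$ correlations are zero.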
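The $\alpha$-independence of the cross correlation in equation (40) can be checked with a small Monte Carlo. The sketch below makes simplifying assumptions that are not in the text: all fields are real, zero-mean, mutually independent Gaussian numbers, and the function name `simulate_beta_hat` and the parameter values are hypothetical.

```python
import numpy as np

def simulate_beta_hat(alpha, beta=0.1, m0=0.01, c0=0.002, n=400_000, seed=1):
    """Estimate beta from <e_measure X> / <X X> for a toy catalogue."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 0.05, n)          # spatially varying quantity X
    gamma = rng.normal(0.0, 0.03, n)      # shear, uncorrelated with X
    e_int = rng.normal(0.0, 0.3, n)       # intrinsic ellipticity
    e_meas = (1.0 + m0 + alpha * x) * (gamma + e_int) + c0 + beta * x
    # With <X> = 0 and no shear-X correlation only the beta term survives.
    return np.mean(e_meas * x) / np.mean(x * x)

beta_with_alpha = simulate_beta_hat(alpha=0.5)
beta_without_alpha = simulate_beta_hat(alpha=0.0)
```

Both estimates scatter around the input $\beta$ because of shape noise, while the difference between them, which isolates the third order $\alpha$ term, is much smaller than that scatter.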

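The corresponding full expressions for the pseudo-$C_\ell$ power spectrum, including the noise correction term (which we assume is uncorrelated with all other terms) are

$$\tilde{C}^{EE}_\ell - \frac{\sigma_n^2}{N_{\rm realisation} N_{\rm object}} = (1 + m_\ell)^2\,C^{EE,\gamma\gamma}_\ell + \alpha^2\sum_{\ell'}\left[M^{EE,XX}_{\ell\ell'}\,C^{EE,\gamma\gamma}_{\ell'} + M^{BB,XX}_{\ell\ell'}\,C^{BB,II}_{\ell'}\right] + 2\alpha(1 + m_\ell)\langle X\rangle C^{EE,\gamma\gamma}_\ell + \beta^2 C^{EE,XX}_\ell$$
$$[C^{eX}_\ell - C^{\gamma X}_\ell - C^{IX}_\ell] = m_\ell(C^{\gamma X}_\ell + C^{IX}_\ell) + \beta C^{XX}_\ell + c_0\langle X_\ell\rangle. \qquad (41)$$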

The second expression has cross-power spectra on both sides. The matrices $M^{XX}$ are the mixing matrices for the spatially varying quantity $X$. In general the variation of $X$ is not isotropic (PSF ellipticity, for example, can have a preferred direction in an image); however, here we make the assumption of isotropy in defining the power.

To calculate these from the simulations we find the best fitting $\alpha$ and $\beta$ values (using a minimum least squares estimator over the $\ell$ range defined in Section 6) for $X =$ PSF size squared and PSF ellipticity. Because this calculation is done on sets that are averaged over noise realisations, it can only be calculated for those sets in which the PSF is fixed for a set (for the PSF correlations).

The relation to the linear power relations $\mathcal{M}$ and $\mathcal{A}$ is not straightforward because of the non-diagonal mixing matrix in general. Therefore in the results Section (Section 4) we will quote values for these correlation coefficients $\alpha_e$, $\alpha_{r^2}$, $\beta_e$, $\beta_{r^2}$ for ellipticity and PSF size squared (the square of the size is the most relevant quantity for propagated PSF-shear behaviour, see Massey et al. 2012, in prep and Paulin-Henriksson et al. 2008). Note that $\alpha$ and $\beta$ have scaled units: for PSF size correlations this means units of $(1/3.4)\,{\rm pixel}^{-2}$, and for ellipticity correlations the quantities are unitless.



APPENDIX C : SIMULATION MODELLING 

In this Section we provide some further details of the variable shear and PSF field, as well as the local modelling of the galaxies and stars.

C1. Scaling of the shear field

We note that in performing the process of sampling the shear field discretely and then generating a postage stamp for each sampling, the inter-postage stamp separation in the final image has a distance of $\theta_{\rm image}/100$, but this is not necessarily related to the pixel scale of the postage stamps, i.e. $\theta_{\rm pixel} \times 48 \times 100 \neq \theta_{\rm image}$ in general. As a result the number density of the galaxies can be scaled as

$$n_0 = \frac{10000}{3600\,\theta^2_{\rm image}} = \frac{2.77}{\theta^2_{\rm image}} \ \mbox{per square arcmin} \qquad (42)$$

and the maximum $\ell$ set by the grid-separation of the galaxies scales as

$$\ell_{\rm max} = \frac{\pi}{\theta_{\rm image}\,\pi/180/100} = \frac{18,000}{\theta_{\rm image}} \qquad (43)$$

where 100 is the number of grid positions on a side and $\theta_{\rm image}$ is in degrees. But note that the true underlying simulated shear field is always fully sampled in every case.

For the case of $\theta_{\rm image} = 10$ degrees this gives values of $n_0 = 0.0277$ and $\ell_{\rm max} = 1800$. The images however can be scaled to match a variety of other configurations, with the caveat that the absolute value of the shear power is constant: $\theta_{\rm image} = 1$ degree gives a scaling of $n_0 = 2.77$ and $\ell_{\rm max} = 18,000$, and $\theta_{\rm image} = 0.5$ degrees gives a scaling of $n_0 = 11.1$ and $\ell_{\rm max} = 36,000$. In each case the absolute amplitude of the calculated shear power also needs to be scaled. It is fair to then match the simulations to any of these cases, which span a reasonable expected dynamical range in number density of objects but with a coupled increase in the maximum $\ell$-range.
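The scalings quoted above follow directly from the 100 x 100 grid, and can be reproduced with a few lines. The function name `scale_image` is a hypothetical helper, and the image size is assumed to be given in degrees.

```python
def scale_image(theta_image_deg, n_grid=100):
    """Return (n0 per square arcmin, ell_max) for an image of given size.

    n_grid is the number of galaxy grid positions on a side.
    """
    # n_grid^2 galaxies spread over (60 * theta_image)^2 square arcminutes.
    n0 = n_grid**2 / (3600.0 * theta_image_deg**2)
    # pi divided by the grid separation in radians.
    ell_max = 180.0 * n_grid / theta_image_deg
    return n0, ell_max

# The three configurations discussed in the text.
configs = {theta: scale_image(theta) for theta in (10.0, 1.0, 0.5)}
```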



C2. The galaxy models 



Here we describe how the individual galaxies are modelled. Each galaxy is composed of a bulge and a disk defined as radial intensity profiles with

$$I(r) = I_i \exp\left\{-K\left[\left(\frac{r}{r_i}\right)^{1/n}\right]\right\} \qquad (44)$$

where $K = 2n - 0.331$ with $n = 4$ for the bulge and $n = 1$ for the disks and $i = \{b, d\}$ for bulge and disk. Both are Sersic profiles (the second simply an exponential). The intensity is normalised to match the signal-to-noise, and the scale radii for the disk and bulge, $r_d$ and $r_b$ respectively, are in general free parameters; the fiducial values were set to be $r_b = 2.3$ and $r_d = 4.8$ pixels. In Bridle et al. (2010), and for the code used for this challenge, the values of the radii $r$ are the half-light radii for both bulges and disks. The disk exponential scale length and half-light radius differ by the factor 1.669.
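A short numerical check shows why $K = 2n - 0.331$ makes $r_i$ the half-light radius and where the factor 1.669 comes from. The sketch assumes circular profiles and the normalisation of equation (44) as written here; the function names are hypothetical, and the closed-form enclosed flux is only valid when $2n$ is an integer, as it is for $n = 1$ and $n = 4$.

```python
import math

def sersic_profile(r, r_half, n, i0=1.0):
    """Intensity I(r) = i0 * exp(-K (r / r_half)^(1/n)), K = 2n - 0.331."""
    k = 2.0 * n - 0.331
    return i0 * math.exp(-k * (r / r_half) ** (1.0 / n))

def enclosed_flux_fraction(x, n):
    """Fraction of the total flux inside radius x * r_half (2n an integer).

    Uses the regularised lower incomplete gamma function P(2n, t) with
    t = K x^(1/n), written as a finite sum for integer 2n.
    """
    k = 2.0 * n - 0.331
    a = int(round(2 * n))
    t = k * x ** (1.0 / n)
    return 1.0 - math.exp(-t) * sum(t**j / math.factorial(j) for j in range(a))

disk_half = enclosed_flux_fraction(1.0, 1)      # exponential disk
bulge_half = enclosed_flux_fraction(1.0, 4)     # de Vaucouleurs bulge
# For n = 1 the profile is exp(-r / h) with scale length h = r_half / 1.669.
scale_length = 4.8 / 1.669
```

Both fractions come out close to one half, which is the sense in which the quoted radii are half-light radii.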

In most sets the size distribution over objects was a compact Gaussian, with a variance of $\sigma_R^2 = 0.01$

$$p(r) \propto \exp\left[-\frac{(r - r_i)^2}{2\sigma_R^2}\right] \qquad (45)$$

and similarly for the disk distribution. In three sets (see Section 3.1) the galaxy size varied for each galaxy in the set; in this case the functional form for the size variation was a Rayleigh distribution

$$P(r) \propto \frac{r}{\sigma_R^2}\exp\left[-\frac{(r - r_i)^2}{2\sigma_R^2}\right] \qquad (46)$$

where $\sigma_R = 2.0$ for these sets, and the $r_i$ are the fiducial values. There is a caveat that the sizes referred to here (and in the GREAT08 simulations) refer to the pre-sheared radii of the objects; as such there is an ellipticity-size correlation that was present in the simulations.

The bulge and disk in general can be mis-centered, however in all but two sets the bulge and disk profiles were co-centered. Object positions were centered in each postage stamp with a Gaussian error position with a standard deviation of 0.5 pixels.

The bulge-to-disk fraction was 50% for the majority of sets, i.e. the flux in the bulge and disk was equal. In those sets in which this varied we used a uniform distribution of bulge-to-disk ratios over the range $b/d = [0.3, 0.95]$, to avoid very low and very high fractions.

The bulge and disk components of the galaxies in the simulations had different intrinsic ellipticity distributions, each described by

$$P_i(e) = e\cos\left(\frac{\pi e}{2}\right)\exp\left[-\left(\frac{2e}{B}\right)^{C}\right] \qquad (47)$$

where $B = 0.09$ and $C = 0.577$ for the bulges and $B = 0.19$ and $C = 0.702$ for the disks (these values are taken from the APM survey, Crittenden et al. 2001). To remove any very highly elliptical galaxies from the sample we truncated this distribution at $e = 0.8$. This model was slightly more complex than in Bridle et al. (2010), allowing for non-coelliptical profiles (i.e. the bulge and disk were allowed to have different ellipticities). This was done so that the ellipticity distributions in equation (47) were conserved. As an example we show the distribution of the disk and bulge angles in Figure 14.

The signal-to-noise was implemented by calculating the noise-free model flux by integrating over the galaxy model and then adding a constant Gaussian noise with a variance of unity and rescaling the galaxy model to yield the correct signal-to-noise. The signal-to-noise was scaled to match the default SExtractor (Bertin & Arnouts 1996) flux_auto/flux_err_auto parameter combination. The galaxy signal-to-noise distribution was a compact Gaussian in the majority of sets, with a variance of $\sigma_S = 0.1$, centered on $(S/N)_i = 20$ for the fiducial set

$$p(S/N) \propto \exp\left[-\frac{(S/N - (S/N)_i)^2}{2\sigma_S^2}\right]. \qquad (48)$$

In three sets (see Section 3.1) the signal-to-noise varied for each galaxy in the set with a functional form for the signal-to-noise variation that was a Rayleigh distribution

$$P(S/N) \propto \frac{S/N}{\sigma_S^2}\exp\left[-\frac{(S/N - (S/N)_i)^2}{2\sigma_S^2}\right] \qquad (49)$$

where $(S/N)_i = 20$ and $\sigma_S = 5.0$ for these sets.

Figure 14. The distributions of bulge and disk ellipticities for a typical image within the fiducial set. Left panels show the distribution of ellipticities for bulge and disk. The top right panel shows the uniform distribution of disk position angles, and the bottom right panel shows the difference between the bulge and disk position angles.



C3. The PSF models 

The PSF model consisted of a static component that modelled the local PSF functional form and a spatially varying kernel that mapped the parameters of this local model across the image plane. The local functional form was a Moffat profile

$$I(r) = I_0\left[1 + \left(\frac{r}{r_d}\right)^2\right]^{-\beta} \qquad (50)$$

where the scale radius $r_d$ was a variable quantity across each image, related to the FWHM, and the power $\beta = 3$ for all images. After generating a circular PSF, it was made elliptical by distortion using the shear matrix given in Kitching et al. (2011), such that there were three parameters which locally describe the PSF $(r_d, e_1, e_2)$. As for the galaxies, the size was the pre-sheared size of the PSF.
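The relation between the Moffat scale radius and the FWHM mentioned above can be made explicit. The sketch assumes the circular (pre-sheared) profile of equation (50) with unit central intensity; the function names are hypothetical.

```python
import math

def moffat(r, r_d, beta=3.0):
    """Circular Moffat profile with unit central intensity."""
    return (1.0 + (r / r_d) ** 2) ** (-beta)

def moffat_fwhm(r_d, beta=3.0):
    """Full width at half maximum: solve moffat(FWHM / 2) = 1/2."""
    return 2.0 * r_d * math.sqrt(2.0 ** (1.0 / beta) - 1.0)

# Toy usage with a hypothetical scale radius of 3 pixels.
fwhm = moffat_fwhm(3.0)
half_max = moffat(0.5 * fwhm, 3.0)
```

For $\beta = 3$ the FWHM is close to the scale radius itself, about $1.02\,r_d$.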
The PSF spatial variation consisted of three components

• Static Component. These were spatially constant across the image and consisted of i) a Gaussian smoothing kernel that added to the PSF size, with a variance of 0.1 present in all images, and ii) a static additive ellipticity component of 0.05 in $e_{1,{\rm PSF}}$ and $e_{2,{\rm PSF}}$ to simulate tracking error.

• Deterministic Component. This was to simulate the impact of the telescope on the PSF size and ellipticity. We used the Jarvis, Schechter and Jain (2008) model to simulate this with fiducial parameters ($a_0 = 0.014$, $a_1 = 0.0005$, $d_0 = -0.006$, $d_1 = 0.001$, $c_0 = -0.010$), which is dominated by primary astigmatism ($a_0$), primary de-focus ($d_0$) and coma ($c_0$).

• Random Component. To simulate the random turbulent effect of the atmosphere, in some of the sets we additionally included a random Gaussian field in the ellipticity only, with a Kolmogorov power spectrum $C_\ell$ (see Rowe, 2010 and Heymans et al., 2012 for discussion on this kind of power spectrum PSF variation seen in optical weak lensing images).

In Figure 15 we show a typical PSF pattern for an image in a set with no random Kolmogorov variation and one in which there is a random Kolmogorov component. As described in Section 2 participants were provided with the PSF as an exact functional form, consisting of tabulated numbers for $(r_d, e_1, e_2)$ at the position of each galaxy, and as a pixelated stellar image.




APPENDIX D : SET DESCRIPTION 

In the Table below we provide the parameter values that define each set in the GREAT10 Galaxy Challenge simulations.

Figure 15. Each panel shows an entire simulated image, showing the typical PSF pattern for an image in a set (image 100 in set 1) with no random Kolmogorov component (upper panels) and for an image in a set (image 100 in set 19) with a random Kolmogorov component (lower panels). The 100x100 grid has been downsampled to 30x30 in these panels for clarity. The left panels show the amplitude of the ellipticity in the colour scale, and the orientation of the PSF denoted by the whiskers. The right hand panels show the size of the PSF in the colour scale in units of pixels. In each image in a set these patterns changed, except in those sets where the PSF spatial variation was fixed (see Appendix D).

Set  Name           Fixed  S/N  S/N Dist.  r_b/pix.  r_d/pix.  B/D Fraction  B-D Offset/pix.^2  r Dist.   KM Power
1    Fiducial       -      20   Gaussian   2.3       4.8       0.5           0.0                Gaussian  None
2    Fiducial       PSF    20   Gaussian   2.3       4.8       0.5           0.0                Gaussian  None
3    Fiducial       Int    20   Gaussian   2.3       4.8       0.5           0.0                Gaussian  None
4    Low S/N        -      10   Gaussian   2.3       4.8       0.5           0.0                Gaussian  None
5    Low S/N        PSF    10   Gaussian   2.3       4.8       0.5           0.0                Gaussian  None
6    Low S/N        Int    10   Gaussian   2.3       4.8       0.5           0.0                Gaussian  None
7    High S/N       -      40   Gaussian   2.3       4.8       0.5           0.0                Gaussian  None
8    High S/N       PSF    40   Gaussian   2.3       4.8       0.5           0.0                Gaussian  None
9    High S/N       Int    40   Gaussian   2.3       4.8       0.5           0.0                Gaussian  None
10   Smooth S/N     -      20   Rayleigh   2.3       4.8       0.5           0.0                Gaussian  None
11   Smooth S/N     PSF    20   Rayleigh   2.3       4.8       0.5           0.0                Gaussian  None
12   Smooth S/N     Int    20   Rayleigh   2.3       4.8       0.5           0.0                Gaussian  None
13   Small Galaxy   -      20   Gaussian   1.8       2.6       0.5           0.0                Gaussian  None
14   Small Galaxy   PSF    20   Gaussian   1.8       2.6       0.5           0.0                Gaussian  None
15   Large Galaxy   -      20   Gaussian   3.4       10.0      0.5           0.0                Gaussian  None
16   Large Galaxy   PSF    20   Gaussian   3.4       10.0      0.5           0.0                Gaussian  None
17   Smooth Galaxy  -      20   Gaussian   2.3       4.8       0.5           0.0                Rayleigh  None
18   Smooth Galaxy  PSF    20   Gaussian   2.3       4.8       0.5           0.0                Rayleigh  None
19   Kolmogorov     -      20   Gaussian   2.3       4.8       0.5           0.0                Gaussian  Yes
20   Kolmogorov     PSF    20   Gaussian   2.3       4.8       0.5           0.0                Gaussian  Yes
21   Uniform b/d    -      20   Gaussian   2.3       4.8       [0.3,0.95]    0.0                Gaussian  None
22   Uniform b/d    PSF    20   Gaussian   2.3       4.8       [0.3,0.95]    0.0                Gaussian  None
23   Offset b/d     -      20   Gaussian   2.3       4.8       0.5           0.5                Gaussian  None
24   Offset b/d     PSF    20   Gaussian   2.3       4.8       0.5           0.5                Gaussian  None

Table 5. A summary of the variables that define each set in the GREAT10 Galaxy Challenge simulations. The variables in bold are those that distinguish each set from the fiducial one. The third column lists those fields that were fixed over each image in each set. Columns 4 and 9 list the distribution used for the signal-to-noise and galaxy sizes respectively. Column 8 shows the variance of the offset between the bulge and disk components in pixels squared.
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APPENDIX E : DESCRIPTION OF THE METHODS 

Here we briefly summarise the methods that took part in the chaUenge. We encourage the reader to refer to the 
methods' own papers for more details. 

For each method we show 3 figures these are 

(i) A reconstruction of the shear power spectrum for each set comparing the submitted power, true power and 
pixel noise corrected power, and the M., A and Qdn values for all sets. 

(ii) The measured minus true shear on an objcct-by-objcct basis as a function of the true shear 7*, the PSF 
ellipticity and size, the bulge-to-disk angle and fraction and the bulge size; for 7* the gradient and offset of this 
fit is are m and c, in all cases we make 10 bins the variable quantity. We also show a value for q, a non-linear 
shear response for each metric keeping m and c fixed at their best fit values (see equation 35). 

(iii) The m and c values as a function of PSF ellipticity and size, the bulge-to-disk angle and fraction and the 
bulge size. In all cases we make 10 bins the variable quantity. 

Because these figures contain a wealth of information, for the latter two we plot the gradient and offset values for
a linear fit through the points and display these values in the figures. In the top right-hand corner of each of the
subplots we show the difference in the reduced $\chi^2$ between the best linear fit and the best constant fit (gradient
equal to zero), $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$; this can be used as an indicator of the significance of any
linearly varying behaviour.
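
The comparison between the linear and constant fits can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used for the figures: the function names are hypothetical, the points are assumed to carry independent Gaussian errors sigma, and the reduced chi-squared is taken as chi-squared divided by the number of points minus the number of fitted parameters.

```python
def reduced_chi2_constant(y, sigma):
    """Reduced chi^2 of the best-fitting constant (gradient fixed to zero)."""
    w = [1.0 / s ** 2 for s in sigma]
    offset = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    chi2 = sum(wi * (yi - offset) ** 2 for wi, yi in zip(w, y))
    return chi2 / (len(y) - 1)


def reduced_chi2_linear(x, y, sigma):
    """Reduced chi^2 of the best-fitting straight line (gradient and offset free)."""
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx ** 2
    gradient = (sw * swxy - swx * swy) / det
    offset = (swxx * swy - swx * swxy) / det
    chi2 = sum(wi * (yi - gradient * xi - offset) ** 2 for wi, xi, yi in zip(w, x, y))
    return chi2 / (len(y) - 2)


def delta_chi2(x, y, sigma):
    """Reduced chi^2 (gradient, offset) minus reduced chi^2 (offset)."""
    return reduced_chi2_linear(x, y, sigma) - reduced_chi2_constant(y, sigma)


# Ten bins with alternating scatter, with and without an underlying trend.
x_bins = list(range(10))
scatter = [0.1 if i % 2 == 0 else -0.1 for i in x_bins]
sigma = [0.1] * 10
delta_flat = delta_chi2(x_bins, scatter, sigma)
delta_sloped = delta_chi2(x_bins, [0.05 * xi + s for xi, s in zip(x_bins, scatter)], sigma)
```

With this sign convention a clearly negative value means the linear fit is preferred per degree of freedom, while a value near zero or positive means the data are compatible with no linear trend.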

For power spectrum submissions the latter two plots (concerned with individual one-point shear biases) will
not be shown.

Figure 16. The true shear power (green) for each set and the shear power for the 'ARES 50/50' submission (red); we also
show the 'denoised' power spectrum (blue) for each set (where this is indistinguishable from the raw submission, only
the red line is legible). The y-axes are $C_\ell \ell^2$ and the x-axis is $\ell$. In the bottom right-hand corner we show $\mathcal{M}/2$ and $\sqrt{\mathcal{A}}$; the
colour scale represents the logarithm of the quality factor. The small numbers next to each point label the set number.

E1. ARES : Peter Melchior

Comparing the results of DEIMOS and KSB, we found several sets where the ellipticities measured with the two
methods strongly and consistently disagreed, with relative deviations of up to 25%. With additional simulations
we investigated when such discrepancies between KSB and DEIMOS occur, and concluded that mainly very
small, i.e. badly resolved, galaxies are responsible for large relative deviations, with KSB having too weak and
DEIMOS too strong a response to galactic ellipticities. Hence, a linear combination of the shear estimates of KSB
and DEIMOS appeared advantageous. With the results of our simulations, a weighting scheme was defined that
aims to minimise the mean squared error on the ellipticity of each galaxy. For GREAT10, the weight for each set
was adjusted independently.
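
The idea of combining an under-responding and an over-responding estimator can be sketched as below. This is a toy illustration rather than the ARES weighting scheme itself: the function names are hypothetical, a single weight is fitted to a whole simulated sample with known true ellipticities, and the two input estimators are idealised as having purely multiplicative errors.

```python
def optimal_weight(est_a, est_b, truth):
    """Weight w that minimises mean((w*a + (1 - w)*b - truth)^2) over a simulated sample."""
    num = sum((b - t) * (a - b) for a, b, t in zip(est_a, est_b, truth))
    den = sum((a - b) ** 2 for a, b in zip(est_a, est_b))
    return -num / den


def combine(est_a, est_b, w):
    """Linear combination of two ellipticity estimates with weights w and 1 - w."""
    return [w * a + (1 - w) * b for a, b in zip(est_a, est_b)]


truth = [-0.3 + 0.06 * i for i in range(11)]
under = [0.9 * t for t in truth]   # toy estimator with too weak a response
over = [1.1 * t for t in truth]    # toy estimator with too strong a response

w = optimal_weight(under, over, truth)
combined = combine(under, over, w)
```

For these symmetric toy biases the minimum of the mean squared error is at equal weights, and the combined estimate reproduces the input ellipticities.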

[Figure 17 annotations: m = -0.026483 +- 0.0034271, q = 0.01051 +- 0.1484, c = 3.4958e-05 +- 5.1374e-05]

Figure 17. The measured minus true shear for the 'ARES 50/50' submission as a function of the true shear, PSF ellipticity,
PSF FWHM, galaxy bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each dependency we
fit a linear function with a gradient and offset; for the top left-hand panel these are the STEP m and c values. Additionally, for the
shear dependency we separately include a quadratic term q. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 18. The STEP m and c values for the 'ARES 50/50' submission as a function of PSF FWHM and ellipticity, galaxy
bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each variable we plot a linear relation fitted
to the behaviour of m and c. For clarity we do not explicitly quote errors on all parameters; the average errors on m and c
are $\sim 0.005$ and $5 \times 10^{-5}$ respectively. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 19. The true shear power (green) for each set and the shear power for the 'cat2-unfold' submission (red); we also
show the 'denoised' power spectrum (blue) for each set (where this is indistinguishable from the raw submission, only
the red line is legible). The y-axes are $C_\ell \ell^2$ and the x-axis is $\ell$. In the bottom right-hand corner we show $\mathcal{M}/2$ and $\sqrt{\mathcal{A}}$; the
colour scale represents the logarithm of the quality factor. The small numbers next to each point label the set number.



E2. cat-unfold: David Kirkby, Daniel Margala 

See fit-unfold description. 

Figure 20. The true shear power (green) for each set and the shear power for the 'DEIMOS C6' submission (red); we also
show the 'denoised' power spectrum (blue) for each set (where this is indistinguishable from the raw submission, only
the red line is legible). The y-axes are $C_\ell \ell^2$ and the x-axis is $\ell$. In the bottom right-hand corner we show $\mathcal{M}/2$ and $\sqrt{\mathcal{A}}$; the
colour scale represents the logarithm of the quality factor. The small numbers next to each point label the set number.

E3. DEIMOS : Peter Melchior, Massimo Viola, Julia Young, Kenneth Patton 

DEIMOS (Melchior et al., 2011) measures the second-order moments of the light distribution using an elliptical
Gaussian weight function, whose width is adjusted so as to maximise the S/N of the measurement. The centroid
of the galaxy and the ellipticity of the weight function are iteratively matched to the apparent (i.e. PSF-convolved)
galaxy (the method was first described by Bernstein & Jarvis, 2002). The application of the weight function to
the image is then corrected by considering higher-order moments. These corrections become increasingly accurate
with increasing width of the weight function, or with increasing correction order. For GREAT10 we used correction orders of
4 to 8, i.e. considering the effect of weighting on the moments of order 6 to 10. This correction scheme has been
shown to introduce very small biases, of the order of 1%, mostly for very small galaxies. After the deweighting,
we deconvolve the galactic moments from the moments of the PSF, for which we have established an exact and
analytic approach. The PSF was measured with a weight function of the same width as that used for the galaxy, but the
ellipticity of the weight function was allowed to match the ellipticity of the PSF. From the deconvolved moments
we determine the complex ellipticity e, which theoretically provides an unbiased estimator of the gravitational
shear and thus does not need any susceptibility or responsivity corrections.

The only free parameters are the correction order, which we varied from 4 to 8 (e.g. "DEIMOS
C6"), and the range of weight function widths. No model of either galaxy or PSF is employed. The pixel values
are taken at center-pixel positions; no interpolation to sub-pixel resolution is applied.
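
The moment deconvolution step can be illustrated in its simplest form. The sketch below is not the DEIMOS code: it uses unweighted second-order moments of noise-free Gaussian stamps, so the weighting and deweighting steps described above are omitted entirely. It relies on the fact that, for flux-normalised unweighted moments, the second-order moments of a convolution are the sum of those of the galaxy and the PSF. The function names are hypothetical, and the ellipticity is assumed to be defined as (Q11 - Q22 + 2i Q12) / (Q11 + Q22 + 2 sqrt(Q11 Q22 - Q12^2)).

```python
import math


def gaussian_stamp(cov, half=24):
    """Noise-free elliptical Gaussian sampled at pixel centres; cov = (Q11, Q12, Q22)."""
    q11, q12, q22 = cov
    det = q11 * q22 - q12 ** 2
    stamp = {}
    for x in range(-half, half + 1):
        for y in range(-half, half + 1):
            r2 = (q22 * x * x - 2.0 * q12 * x * y + q11 * y * y) / det
            stamp[(x, y)] = math.exp(-0.5 * r2)
    return stamp


def second_moments(stamp):
    """Unweighted, flux-normalised central second moments (Q11, Q12, Q22)."""
    flux = sum(stamp.values())
    xc = sum(v * x for (x, y), v in stamp.items()) / flux
    yc = sum(v * y for (x, y), v in stamp.items()) / flux
    q11 = sum(v * (x - xc) ** 2 for (x, y), v in stamp.items()) / flux
    q12 = sum(v * (x - xc) * (y - yc) for (x, y), v in stamp.items()) / flux
    q22 = sum(v * (y - yc) ** 2 for (x, y), v in stamp.items()) / flux
    return q11, q12, q22


def deconvolve(obs, psf):
    """Second-order moment deconvolution: subtract the PSF moments."""
    return tuple(o - p for o, p in zip(obs, psf))


def ellipticity(q11, q12, q22):
    """Complex ellipticity from second moments (assumed definition, see lead-in)."""
    denom = q11 + q22 + 2.0 * math.sqrt(q11 * q22 - q12 ** 2)
    return complex(q11 - q22, 2.0 * q12) / denom


gal_cov = (6.0, 1.5, 4.0)   # toy intrinsic galaxy moments, in pixels squared
psf_cov = (4.0, 0.0, 4.0)   # toy circular PSF
obs_cov = tuple(g + p for g, p in zip(gal_cov, psf_cov))  # Gaussian convolved with Gaussian

obs_q = second_moments(gaussian_stamp(obs_cov))
psf_q = second_moments(gaussian_stamp(psf_cov))
e_rec = ellipticity(*deconvolve(obs_q, psf_q))
e_true = ellipticity(*gal_cov)
```

In this idealised setting the recovered ellipticity matches the intrinsic one; with noise and a finite weight function the deweighting corrections described in the text become necessary.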

Figure 21. The measured minus true shear for the 'DEIMOS C6' submission as a function of the true shear, PSF ellipticity,
PSF FWHM, galaxy bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each dependency we
fit a linear function with a gradient and offset; for the top left-hand panel these are the STEP m and c values. Additionally, for the
shear dependency we separately include a quadratic term q. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 22. The STEP m and c values for the 'DEIMOS C6' submission as a function of PSF FWHM and ellipticity, galaxy
bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each variable we plot a linear relation fitted
to the behaviour of m and c. For clarity we do not explicitly quote errors on all parameters; the average errors on m and c
are $\sim 0.005$ and $5 \times 10^{-5}$ respectively. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

© 2012 RAS, MNRAS 000, ??-?? 



[Figure 23 panels: shear power against l-mode (legend: Submitted, True) for sets 1 Fiducial; 2 Fiducial (fixed PSF); 3 Fiducial (fixed IA); 5 SN=10 (fixed PSF); 6 SN=10 (fixed IA); 7 SN=40 (training); 8 SN=40 (fixed PSF); 9 SN=40 (fixed IA); 11 Smooth SN (fixed PSF); 12 Smooth SN (fixed IA); 14 R/Rp=0.53 (fixed PSF); 16 R/Rp=1.41 (fixed PSF); 18 Smooth R (fixed PSF); 20 (PSF label illegible) (fixed PSF); 22 Uni b/d (fixed PSF); 24 Offset b/d (fixed PSF). Summary panel: x-axis M/2, colour scale log(Q_dn).]



Figure 23. The true shear power (green) for each set and the shear power for the 'fit2-unfold' submission (red); we also
show the 'denoised' power spectrum (blue) for each set (where this is indistinguishable from the raw submission, only
the red line is legible). The y-axes are $C_\ell \ell^2$ and the x-axis is $\ell$. In the bottom right-hand corner we show $\mathcal{M}/2$ and $\sqrt{\mathcal{A}}$; the
colour scale represents the logarithm of the quality factor. The small numbers next to each point label the set number.



E4. fit-unfold, cat-unfold, shapefit : David Kirkby, Daniel Margala 

Each of these names refers to a different submission from the same underlying software; fit-unfold and cat-unfold
were power spectrum submissions. The DeepZot analysis pipeline consists of four layers of software, implemented
as C++ libraries, that were used for both the GREAT10 Galaxy Challenge and the MDM Challenge (Kitching
et al. in prep). The first layer provides a uniform interface to the GREAT10 and MDM datasets. The next
layer performs PSF and galaxy shape estimation using a maximum likelihood model-fitting method. A half-trace
approximation KSB method is also implemented for comparison with earlier work and to provide a fast bootstrap
of the model fit. The model-fitting code incorporates an optimised image synthesis engine and uses the MINUIT
minimisation library to calculate full covariance matrices. The third layer provides supervised machine learning
when a suitable training set is available, and is based on the TMVA package. The best results in the MDM
Challenge were obtained with a 13-input neural network that derives ellipticity corrections from a combination
of model-fitted parameters, covariance matrix elements, and KSB results. The final layer of the DeepZot software
pipeline performs power-spectrum estimation and uses the model-fit errors to determine and subtract the variance
due to shape measurement errors. The main computational bottleneck in the DeepZot pipeline is the model fit,
which currently requires about 500 ms per galaxy on a single Intel Xeon core for a typical fit to a 19-parameter
galaxy model in which seven parameters are floating and a full covariance matrix is obtained.

Figure 24. The true shear power (green) for each set and the shear power for the 'gfit' submission (red); we also show the
'denoised' power spectrum (blue) for each set (where this is indistinguishable from the raw submission, only the red line is
legible). The y-axes are $C_\ell \ell^2$ and the x-axis is $\ell$. In the bottom right-hand corner we show $\mathcal{M}/2$ and $\sqrt{\mathcal{A}}$; the colour
scale represents the logarithm of the quality factor. The small numbers next to each point label the set number.

E5. gfit : Marc Gentile, Frederic Courbin, Guldariya Nurbaeva 

The gfit shear measurement method is a simple forward model-fitting method in which the underlying galaxy is
modelled using a 7-parameter Sersic profile. The model parameters are the Sersic index and radius $(n, r_e)$, the
galaxy 2-component ellipticity $(e_1, e_2)$, the centroid $(x_c, y_c)$ and the flux intensity ($I_0$) at r = 0. The galaxy and
PSF centroids were estimated using SExtractor (Bertin & Arnouts, 1996).
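
A seven-parameter Sersic model of this kind can be sketched as follows. This is an illustrative parameterisation and not the gfit implementation: the function name is hypothetical, the radius is assumed to be the half-light radius, the coefficient b_n uses the common approximation 2n - 1/3, and the ellipticity is assumed to have modulus (a - b)/(a + b) and to enter through an area-preserving elliptical radius. PSF convolution and pixel integration are left out.

```python
import math


def sersic_intensity(x, y, n, r_e, e1, e2, xc, yc, i0):
    """Intensity of an elliptical Sersic profile at pixel position (x, y)."""
    b_n = 2.0 * n - 1.0 / 3.0            # approximate Sersic coefficient (assumption)
    e = math.hypot(e1, e2)
    q = (1.0 - e) / (1.0 + e)            # axis ratio b/a if |e| = (a - b)/(a + b)
    phi = 0.5 * math.atan2(e2, e1)       # position angle of the major axis
    dx, dy = x - xc, y - yc
    xp = dx * math.cos(phi) + dy * math.sin(phi)     # coordinate along the major axis
    yp = -dx * math.sin(phi) + dy * math.cos(phi)    # coordinate along the minor axis
    r = math.sqrt(q * xp ** 2 + yp ** 2 / q)         # area-preserving elliptical radius
    return i0 * math.exp(-b_n * (r / r_e) ** (1.0 / n))


# The central intensity equals I0, and a positive e1 elongates the model along x.
centre = sersic_intensity(5.0, 7.0, n=1.5, r_e=3.0, e1=0.2, e2=0.1, xc=5.0, yc=7.0, i0=2.0)
along_x = sersic_intensity(3.0, 0.0, n=1.0, r_e=3.0, e1=0.3, e2=0.0, xc=0.0, yc=0.0, i0=1.0)
along_y = sersic_intensity(0.0, 3.0, n=1.0, r_e=3.0, e1=0.3, e2=0.0, xc=0.0, yc=0.0, i0=1.0)
circular = sersic_intensity(3.0, 0.0, n=1.0, r_e=3.0, e1=0.0, e2=0.0, xc=0.0, yc=0.0, i0=1.0)
```

A forward-fitting method would convolve such a model with the PSF, compare it with the observed stamp, and adjust the seven parameters with a minimiser.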

For GREAT10, gfit used a different minimiser from the Levenberg-Marquardt-based one previously used in
GREAT08. The minimiser was developed at the Laboratory of Astrophysics of EPFL (LASTRO) with GREAT10
in mind. It has proven more robust and more accurate when fitting low SNR images.

The 'gfit den cs' version of gfit submitted in GREAT10 involved an experimental implementation of the new
DWT-Wiener wavelet-based denoising method, also developed at LASTRO. DWT-Wiener proved very successful
in all other methods we submitted in the Galaxy Challenge (TVNN, MegaLUT). In the case of gfit, the Q factor
was boosted by an estimated factor of 1.5. More details about the DWT-Wiener method can be found in Nurbaeva,
Courbin et al. (2011).

Figure 25. The measured minus true shear for the 'gfit' submission as a function of the true shear, PSF ellipticity, PSF
FWHM, galaxy bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each dependency we fit a
linear function with a gradient and offset; for the top left-hand panel these are the STEP m and c values. Additionally, for the
shear dependency we separately include a quadratic term q. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 26. The STEP m and c values for the 'gfit' submission as a function of PSF FWHM and ellipticity, galaxy
bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each variable we plot a linear relation fitted to
the behaviour of m and c. For clarity we do not explicitly quote errors on all parameters; the average errors on m and c are
$\sim 0.005$ and $5 \times 10^{-5}$ respectively. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 27. The true shear power (green) for each set and the shear power for the 'im3shape NBC0' submission (red); we
also show the 'denoised' power spectrum (blue) for each set (where this is indistinguishable from the raw submission, only
the red line is legible). The y-axes are $C_\ell \ell^2$ and the x-axis is $\ell$. In the bottom right-hand corner we show $\mathcal{M}/2$ and $\sqrt{\mathcal{A}}$;
the colour scale represents the logarithm of the quality factor. The small numbers next to each point label the set number.

E6. im3shape : Sarah Bridle, Tomasz Kacprzak, Barney Rowe, Lisa Voigt, Joe Zuntz

im3shape fitted a sum of co-elliptical and co-centered Sersic profiles. In this implementation two Sersic profiles
were used, with the Sersic indices fixed to be 1 (disk-like) and 4 (bulge-like) and a bulge-to-disk scale radius ratio
set to 0.9. The functional form for the PSF was provided, and the convolution was performed on a grid three
times the pixel resolution in each direction, with additional integration in the central pixels of the galaxy model
image. The maximum likelihood point was used, with a $\chi^2$ evaluated from the full 48x48 postage stamp. The
output ellipticity $(a - b)/(a + b)$ was used as our shear estimate, but with a correction for noise bias for the
submissions marked "NBC". For the noise bias correction a noisy simulated image was produced of a fiducial
galaxy using the machinery in the im3shape code. Simulations were also produced in which the ellipticity was
increased by 0.1 in one or other direction. A straight line was fitted to the output shear estimates relative to
the input ellipticity to measure multiplicative and additive errors, and it was verified that the multiplicative and
additive errors were zero in the absence of noise. For submissions marked "NBC0" two different kinds of noisy
simulations were performed and used to correct the shear estimates of the corresponding GREAT10 image
sets: (i) Moffat PSF and fiducial GREAT10 SNR; (ii) Moffat PSF and lowest GREAT10 SNR. For NBC1 the
following combinations were used: (i) Moffat PSF, fiducial GREAT10 SNR, PSF FWHM 3.3 pixels, bulge scale
radius 4.3 pixels; (ii) as previous but PSF FWHM 3.1; (iii) as previous but PSF FWHM 3.6; (iv) Moffat PSF,
fiducial GREAT10 SNR, PSF FWHM 3.3 pixels, bulge scale radius 2.3 pixels; (v) as previous but bulge scale
radius 8 pixels; (vi) Moffat PSF, low GREAT10 SNR, PSF FWHM 3.3 pixels, bulge scale radius 4.3 pixels; (vii) as
previous but PSF FWHM 3.1; (viii) as previous but PSF FWHM 3.6. The optimiser used to find the location of
maximum likelihood in the model parameter space was "PRAXIS" (short for Principal AXIS) by Richard Brent,
which is freely available from Netlib at http://www.netlib.org/opt/. The code is specifically written to make it
easy to interchange optimisers, and alternatives are also under investigation. For more information please refer to
Zuntz et al. (in prep) for details about the im3shape code in general and Kacprzak et al. (in prep) for details of
the noise bias calibration.
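
The calibration logic described above can be sketched in a few lines. This is a simplified illustration and not the im3shape calibration code: the function names and the numerical values are hypothetical, a single ellipticity component is used, the straight line is drawn through only two simulated points (the fiducial ellipticity and the one increased by 0.1), and applying the correction by inverting the line is an assumption, since the text does not spell out how the correction was applied.

```python
def ellipticity_from_axes(a, b):
    """Ellipticity (a - b)/(a + b) from the semi-major and semi-minor axes."""
    return (a - b) / (a + b)


def calibrate(e_in_lo, e_out_lo, e_in_hi, e_out_hi):
    """Straight line e_out = (1 + m) e_in + c through two simulated points."""
    slope = (e_out_hi - e_out_lo) / (e_in_hi - e_in_lo)
    m = slope - 1.0
    c = e_out_lo - slope * e_in_lo
    return m, c


def correct(e_obs, m, c):
    """Invert the calibration line to remove the multiplicative and additive bias."""
    return (e_obs - c) / (1.0 + m)


# Hypothetical mean outputs of noisy simulations at input ellipticities 0.0 and 0.1.
m_nb, c_nb = calibrate(0.0, 0.002, 0.1, 0.097)
e_corrected = correct(0.097, m_nb, c_nb)
e_axes = ellipticity_from_axes(3.0, 1.0)
```

In this toy example the noisy fit responds too weakly (m is negative) and the correction restores the input ellipticity of 0.1.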

Figure 28. The measured minus true shear for the 'im3shape NBC0' submission as a function of the true shear, PSF
ellipticity, PSF FWHM, galaxy bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each
dependency we fit a linear function with a gradient and offset; for the top left-hand panel these are the STEP m and c
values. Additionally, for the shear dependency we separately include a quadratic term q. The top right-hand corners show
$\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 29. The STEP m and c values for the 'im3shape NBC0' submission as a function of PSF FWHM and ellipticity,
galaxy bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each variable we plot a linear
relation fitted to the behaviour of m and c. For clarity we do not explicitly quote errors on all parameters; the average errors on
m and c are $\sim 0.005$ and $5 \times 10^{-5}$ respectively. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 30. The true shear power (green) for each set and the shear power for the 'KSB' submission (red); we also show
the 'denoised' power spectrum (blue) for each set (where this is indistinguishable from the raw submission, only the red line is
legible). The y-axes are $C_\ell \ell^2$ and the x-axis is $\ell$. In the bottom right-hand corner we show $\mathcal{M}/2$ and $\sqrt{\mathcal{A}}$; the colour
scale represents the logarithm of the quality factor. The small numbers next to each point label the set number.



E7. KSB : Julia Young, Peter Melchior 

The original KSB approach was implemented with the 'trace-trick', where the inversion of $P^\gamma$ is achieved by
replacing the entire 2x2 matrix by 1/2 of its trace. This approach is employed in several studies, and it has
recently been shown (Viola et al., 2011) that it provides the least biased shear estimates for a variety
of observational conditions.
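
The difference between the trace-trick and a full inversion of the 2x2 matrix can be made concrete with a short sketch. The function names and the numerical values are hypothetical; e denotes the two-component, PSF-corrected ellipticity and p the 2x2 response matrix, both assumed to have been measured already.

```python
def shear_trace_trick(e, p):
    """Shear estimate with the 2x2 matrix replaced by half of its trace."""
    half_trace = 0.5 * (p[0][0] + p[1][1])
    return (e[0] / half_trace, e[1] / half_trace)


def shear_full_inverse(e, p):
    """Shear estimate using the full inverse of the 2x2 matrix."""
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    g1 = (p[1][1] * e[0] - p[0][1] * e[1]) / det
    g2 = (-p[1][0] * e[0] + p[0][0] * e[1]) / det
    return (g1, g2)


e_obs = (0.02, 0.01)
p_aniso = [[1.2, 0.05], [0.05, 0.8]]   # toy anisotropic response matrix
p_iso = [[0.9, 0.0], [0.0, 0.9]]       # toy isotropic response matrix

g_trick = shear_trace_trick(e_obs, p_aniso)
g_full = shear_full_inverse(e_obs, p_aniso)
g_trick_iso = shear_trace_trick(e_obs, p_iso)
g_full_iso = shear_full_inverse(e_obs, p_iso)
```

The two estimates coincide when the matrix is proportional to the identity and differ otherwise; the trace-trick avoids dividing by noisy off-diagonal and difference terms.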

To determine the galaxy centroid and the width of the circular Gaussian weight function, the same iterative
method employed in DEIMOS was used: determine the centroid such that the first moments vanish, and the size
of the weight function so as to maximise S/N. For the final shear estimate, we did not apply additional fudge
factors or responsivity corrections.

Figure 31. The measured minus true shear for the 'KSB' submission as a function of the true shear, PSF ellipticity, PSF
FWHM, galaxy bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each dependency we fit a
linear function with a gradient and offset; for the top left-hand panel these are the STEP m and c values. Additionally, for the
shear dependency we separately include a quadratic term q. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 32. The STEP m and c values for the 'KSB' submission as a function of PSF FWHM and ellipticity, galaxy
bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each variable we plot a linear relation fitted
to the behaviour of m and c. For clarity we do not explicitly quote errors on all parameters; the average errors on m and c
are $\sim 0.005$ and $5 \times 10^{-5}$ respectively. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 33. The true shear power (green) for each set and the shear power for the 'KSB f90' submission (red); we also show
the 'denoised' power spectrum (blue) for each set (where this is indistinguishable from the raw submission, only the red line is
legible). The y-axes are $C_\ell \ell^2$ and the x-axis is $\ell$. In the bottom right-hand corner we show $\mathcal{M}/2$ and $\sqrt{\mathcal{A}}$; the colour
scale represents the logarithm of the quality factor. The small numbers next to each point label the set number.

E8. KSB f90 : Catherine Heymans 

KSB f90 is a benchmark implementation of the longstanding KSB+ method (Kaiser, Squires & Broadhurst
1995, Luppino & Kaiser 1996 and Hoekstra et al 1998). This code is identical to that used in the 'CH' analysis
of STEP1 and GREAT08 (Heymans et al 2006a, Bridle et al 2010) and can therefore be viewed as a
benchmark to compare the different simulations. KSB f90 is publicly available and can be downloaded from
http://www.roe.ac.uk/~heymans/KSBf90. The code has been used to analyse the GEMS and STAGES HST
surveys (Heymans et al 2005, Heymans et al 2008). The accuracy of KSB f90 has a strong S/N dependence, as
shown in this paper, yielding an incorrect redshift scaling of the lensing signal in real data. For this reason, whilst
KSB f90 has been shown to perform well on average and for signal-to-noise > 20, author C. Heymans advises not
to use this shape measurement method for low signal-to-noise data.

Figure 34. The measured minus true shear for the 'KSB f90' submission as a function of the true shear, PSF ellipticity, PSF
FWHM, galaxy bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each dependency we fit a
linear function with a gradient and offset; for the top left-hand panel these are the STEP m and c values. Additionally, for the
shear dependency we separately include a quadratic term q. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 35. The STEP m and c values for the 'KSB f90' submission as a function of PSF FWHM and ellipticity, galaxy
bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each variable we plot a linear relation fitted
to the behaviour of m and c. For clarity we do not explicitly quote errors on all parameters; the average errors on m and c
are $\sim 0.005$ and $5 \times 10^{-5}$ respectively. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 36. The true shear power (green) for each set and the shear power for the 'MegaLUTsim2.1 b20' submission (red);
we also show the 'denoised' power spectrum (blue) for each set (where this is indistinguishable from the raw submission, only
the red line is legible). The y-axes are $C_\ell \ell^2$ and the x-axis is $\ell$. In the bottom right-hand corner we show $\mathcal{M}/2$ and $\sqrt{\mathcal{A}}$;
the colour scale represents the logarithm of the quality factor. The small numbers next to each point label the set number.

E9. MegaLUT : Malte Tewes, Nicolas Cantale, Frederic Courbin 

MegaLUT is a fast empirical method to correct ellipticity measurements of galaxies for the distortions by the
PSF. It uses a straightforward classification scheme, namely a lookup table (LUT), built by supervised learning.
In the scope of our submissions to GREAT10, the successive steps of MegaLUT can be summarised as follows:

1. Simulate a large number of realistic galaxy and PSF stamps and store the sheared galaxy ellipticities prior to
the PSF convolution. This leads to a learning sample of images.

2. Run a shape measurement algorithm on the galaxies and PSFs of this learning sample and create a lookup table
that connects the measured galaxy and PSF shapes to the known galaxy ellipticities stored in the first step.

3. For a given galaxy/PSF pair in the GREAT10 data, run the same shape measurement algorithms as in step 2.
Query the lookup table to identify the galaxy/PSF pairs of the learning sample that have similar measured shapes.
The galaxy ellipticities of these selected pairs, as stored at step 1, yield our estimate of the galaxy ellipticity
prior to the convolution by the PSF.

The complex problem of PSF correction is therefore reduced to a simple and fast array indexing operation.
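
The three steps above can be sketched with a toy lookup table. Everything specific in this example is an assumption made for illustration: the names are hypothetical, a single ellipticity component is used, the "measurement" is a made-up linear smearing of the true ellipticity by the PSF rather than a real shape measurement on images, the table is indexed by binning the measured shapes, and the selected entries are combined by a simple average.

```python
from collections import defaultdict

BIN = 0.02  # hypothetical bin width used to index measured shapes


def key(measured_gal, measured_psf):
    """Table index obtained by binning the measured galaxy and PSF shapes."""
    return (round(measured_gal / BIN), round(measured_psf / BIN))


def toy_measure(true_e, psf_e):
    """Made-up stand-in for a shape measurement on a PSF-convolved galaxy."""
    return 0.6 * true_e + 0.4 * psf_e


def build_table(true_values, psf_values):
    """Steps 1 and 2: simulate the learning sample and index it by measured shapes."""
    table = defaultdict(list)
    for t in true_values:
        for p in psf_values:
            table[key(toy_measure(t, p), p)].append(t)
    return table


def query(table, measured_gal, measured_psf):
    """Step 3: average the stored true ellipticities of the matching learning pairs."""
    matches = table.get(key(measured_gal, measured_psf))
    if not matches:
        return None
    return sum(matches) / len(matches)


true_values = [i * 0.01 for i in range(-50, 51)]
psf_values = [i * 0.02 for i in range(-5, 6)]
table = build_table(true_values, psf_values)

# A galaxy with true ellipticity 0.2 observed through a PSF with ellipticity 0.04.
estimate = query(table, toy_measure(0.2, 0.04), 0.04)
```

The accuracy of such an estimate is limited by the bin width and by how well the learning sample covers the data, which is why a large learning sample is needed.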

For the final submission 'MegaLUTsim2.1 b20', we denoised the galaxy and PSF images with wavelet filtering, 
and built simple threshold masks. The shapes were then measured using second order moments of the masked 
light distributions. The lookup table was generated from 2.1 million simulated galaxy/PSF pairs. 

Figure 37. The measured minus true shear for the 'MegaLUTsim2.1 b20' submission as a function of the true shear,
PSF ellipticity, PSF FWHM, galaxy bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For
each dependency we fit a linear function with a gradient and offset; for the top left-hand panel these are the STEP m and c
values. Additionally, for the shear dependency we separately include a quadratic term q. The top right-hand corners show
$\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 38. The STEP m and c values for the 'MegaLUTsim2.1 b20' submission as a function of PSF FWHM and ellipticity,
galaxy bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each variable we plot a linear
relation fitted to the behaviour of m and c. For clarity we do not explicitly quote errors on all parameters; the average errors on
m and c are $\sim 0.005$ and $5 \times 10^{-5}$ respectively. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.

Figure 39. The true shear power (green) for each set and the shear power for the 'method 4' submission (red); we also
show the 'denoised' power spectrum (blue) for each set (where this is indistinguishable from the raw submission, only
the red line is legible). The y-axes are $C_\ell \ell^2$ and the x-axis is $\ell$. In the bottom right-hand corner we show $\mathcal{M}/2$ and $\sqrt{\mathcal{A}}$; the
colour scale represents the logarithm of the quality factor. The small numbers next to each point label the set number.

E10. method4,5,7 : Micheal Hirsch, Stefan Harmeling

In a series of submissions named method0x with $x \in \{1, \ldots, 7\}$ the effect of taking higher-order pixel correlations
on the accuracy of shear measurement was tested. In method01 the shear was measured by subtracting the
quadrupole moments of the auto-correlated images of the galaxy and corresponding PSF images. The assumption
of uncorrelated noise is confirmed by the fact that the auto-correlation is highly peaked at zero shift. To get rid
of this peak, which impedes accurate moment estimation, a rough estimate of the noise variance was obtained by
computing the variance of pixels with negative intensity values only (assuming Gaussian noise with zero mean),
which was then subtracted from the central pixel. As in any other KSB-type method, noise affects moment
estimation and has to be accounted for by some weighting scheme. To this end both galaxy and star images
were modulated by a Gaussian with fixed variance and zero centroid. By noticing that a pixel-wise modulation
corresponds to a convolution in Fourier space, the error induced by the modulation could be
removed by subtracting the measured quadrupole moment and the fixed variance of the Gaussian distribution used
for weighting in the Fourier domain. In method04, we went one step further by computing the auto-correlation
of the auto-correlated galaxy or star image, otherwise pursuing the same approach as described above. In
this way the images are smoothed even further and remain centered, so that inaccuracies in centroid estimation
are not an issue in our approach. All other methods are variants of the above in which the empirical moment
estimation with a Gaussian weighting scheme was replaced by a model-fitting approach (method02), an
additional denoising step was introduced (methodOS), empirical moment estimation was done without additional weighting (methodOS),
or the PSF was accounted for by a Wiener deconvolution of the galaxy images before moment estimation (method07).
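
The noise-peak removal described for method01 can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' code: the function names are hypothetical, the auto-correlation is taken to be unnormalised (so that uncorrelated noise adds the number of pixels times the noise variance at zero shift), and the example uses pure noise, whereas in a real stamp the negative pixels are only approximately noise-dominated.

```python
import random


def noise_variance_from_negatives(pixels):
    """Mean square of the negative pixels; equals sigma^2 for zero-mean Gaussian noise."""
    negatives = [p for p in pixels if p < 0.0]
    return sum(p * p for p in negatives) / len(negatives)


def remove_noise_peak(acf_zero_shift, n_pixels, noise_var):
    """Subtract the noise contribution from the zero-shift value of an unnormalised ACF."""
    return acf_zero_shift - n_pixels * noise_var


rng = random.Random(1)
sigma = 2.0
noise = [rng.gauss(0.0, sigma) for _ in range(20000)]

var_est = noise_variance_from_negatives(noise)
zero_shift = sum(p * p for p in noise)   # zero-shift value of the unnormalised ACF
corrected = remove_noise_peak(zero_shift, len(noise), var_est)
```

For a pure-noise stamp the corrected zero-shift value is small compared with the original peak, which is the behaviour the moment estimation relies on.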

Figure 40. The measured minus true shear for the 'method 4' submission as a function of the true shear, PSF ellipticity, PSF
FWHM, galaxy bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each dependency we fit a
linear function with a gradient and offset; for the top left-hand panel these are the STEP m and c values. Additionally, for the
shear dependency we separately include a quadratic term q. The top right-hand corners show $\Delta\chi^2 = \chi^2({\rm gradient, offset}) - \chi^2({\rm offset})$.



[Figure 41 panels: linear fits to m and c against PSF size/pixels, PSF e, b-d angle, b-d frac and b size. The number after each fit is the value printed in the corner of the panel, where legible.
m = (-0.13566) PSF size + (0.25895), 0.059
m = (1.533) PSF e + (-0.09935), 0.423
m = (-0.0006659) b-d angle + (-0.10509), 0.097
m = (0.33938) b-d frac + (-0.31222), 1.958
m = (0.011163) b size + (-0.14136), 0.114
c = (-0.00032042) PSF size + (0.0010646)
c = (-0.0011805) PSF e + (-5.343e-06), 0.003
c = (1.0024e-06) b-d angle + (-0.00012689), 0.068
c = (gradient partly illegible) b-d frac + (0.00039698), 0.03
c = (-2.9374e-05) b size + (9.2185e-05), 0.007]



Figure 41. The STEP m and c values for the 'method 4' submission as a function of PSF FHWM and ellipticity, galaxy 
bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each variable we plot the a linear relation 
to the behaviour of m and c. We do not explicitly quote errors on all parameters for clarity, the average errors on m and c 
are ~ 0.005 and 5 X 10~^ respectively. The top right hand corners show Ax'^ = x^ (gradient, offset) — x^(offsot). 
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Figure 42. The true shear power (green) for each set and the shear power for the 'shapefit' submission (red), we also show 
the 'denoised' power spectrum (blue) for each set (where this is indistinguishable from the raw submission a red line is only 
legible). The y-axes are C^l'^ and the x-axis is £. In the bottom righthand corner we show the Ai/2, %/A and the colour 
scale represents the logarithm of the quality factor. The small numbers next to each point label the set number. 

Ell. shapefit: David Kirkby, Daniel Margala 

See fit-unfold description. 
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Figure 43. The measured minus true shear for the 'shapefit' submission as a function of the true shear, PSF eUipticity, PSF 
FWHM, galaxy bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each dependency we fit a 
linear function with a gradient and offset, for the top left hand panel this is the STEP m and c values, additionally for the 
shear dependency we include a quadratic term separately q. The top right hand corners show Ax'^ = (gradient, offset) — 
(offset). 
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Figure 44. The STEP m and c values for the 'shapefit' submission as a function of PSF FHWM and eUipticity, galaxy 
bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each variable we plot the a linear relation 
to the behaviour of m and c. We do not explicitly quote errors on all parameters for clarity, the average errors on m and c 
are ~ 0.005 and 5 X 10~^ respectively. The top right hand corners show Ax^ = (gradient, offset) — x^(offset). 
Figure 45. The true shear power (green) for each set and the shear power for the 'NN23' submission (red); we also show
the 'denoised' power spectrum (blue) for each set (where this is indistinguishable from the raw submission, only the red line
is legible). The y-axes are $C_\ell \ell^2$ and the x-axis is $\ell$. In the bottom right hand corner we show $M/2$, $\sqrt{A}$ and the colour
scale represents the logarithm of the quality factor. The small numbers next to each point label the set number.



E12. TVNN: Guldariya Nurbaeva, Frederic Courbin, Malte Tewes, Marc Gentile 

The methods NN23 func, NN19 and NN21, submitted to GREAT10, were variants of the Total Variation Neural
Network (TVNN) method, a deconvolution technique based on the combination of a Hopfield neural
network (Hopfield, 1982) with the Total Variation model proposed by Rudin, Osher and Fatemi (Rudin, 1992).
In the Total Variation model, the noise in the image is assumed to follow a Gaussian distribution.

The deconvolution process is carried out by minimising the energy function of the Hopfield Neural Network. 
This energy function is composed of the PSF, expressed as a Toeplitz matrix, and of a regularisation term to 
minimise the noise. The latter is a Sobel high-pass operator. The deconvolution itself is done in an iterative way 
where at each step, the neurons of the network are updated so as to minimise the energy function. 
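
As a rough illustration of the iterative minimisation described above, the sketch below runs plain gradient descent on a quadratic energy built from a Toeplitz PSF matrix and a high-pass regularisation term. It is not the TVNN implementation: the Hopfield neuron update is replaced by a simple gradient step, the problem is one-dimensional, and a first-difference operator stands in for the Sobel operator. The function names, the regularisation weight `lam` and the step-size rule are all assumptions made for this example.

```python
import numpy as np


def toeplitz_psf(half_kernel, n):
    """Symmetric Toeplitz matrix H with H[i, j] = half_kernel[|i - j|]."""
    column = np.zeros(n)
    column[: len(half_kernel)] = half_kernel  # zero beyond the kernel support
    lag = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return column[lag]


def high_pass(n):
    """First-difference operator, a 1D stand-in for a Sobel-type filter."""
    return np.diff(np.eye(n), axis=0)


def energy(x, H, D, y, lam):
    """Data-fidelity term plus regularisation term."""
    return 0.5 * np.sum((H @ x - y) ** 2) + 0.5 * lam * np.sum((D @ x) ** 2)


def deconvolve(y, H, D, lam=1e-2, n_iter=300):
    """Minimise the energy by gradient descent; returns estimate and energy history."""
    A = H.T @ H + lam * (D.T @ D)
    step = 1.0 / np.linalg.eigvalsh(A)[-1]  # 1 / largest eigenvalue guarantees descent
    x = y.copy()
    history = [energy(x, H, D, y, lam)]
    for _ in range(n_iter):
        x = x - step * (A @ x - H.T @ y)  # gradient of the quadratic energy
        history.append(energy(x, H, D, y, lam))
    return x, np.array(history)


# Toy data: a Gaussian profile blurred by a Gaussian PSF, with a little noise.
n = 64
truth = np.exp(-0.5 * ((np.arange(n) - 32) / 3.0) ** 2)
half = np.exp(-0.5 * (np.arange(7) / 2.0) ** 2)
half /= half[0] + 2.0 * half[1:].sum()  # full symmetric kernel has unit flux
H, D = toeplitz_psf(half, n), high_pass(n)
noisy = H @ truth + np.random.default_rng(1).normal(0.0, 1e-3, n)
estimate, history = deconvolve(noisy, H, D)
```

With the step size set to the inverse of the largest eigenvalue of the quadratic form, the energy cannot increase from one iteration to the next, which mirrors the "update so as to minimise the energy" behaviour in a much simpler setting.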

Galaxy ellipticities are then estimated from quadrupole moments computed on the 2D auto-correlation function
(ACF) of the deconvolved image. The advantages of using the ACF are (1) high signal-to-noise shape measurement
and (2) invariance of the ellipticity measurement with respect to data (Waerbeke, 1997; Miralda-Escude,
1991).
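
A minimal sketch of measuring an ellipticity from the quadrupole moments of an ACF is given below. It assumes unweighted moments, a noise-free stamp and the convention e1 = (Qxx - Qyy)/(Qxx + Qyy), e2 = 2Qxy/(Qxx + Qyy); none of these choices is specified in the description above, and the function names are hypothetical. Because the ACF always peaks at zero lag, no centroid estimate is needed, so translating the object within the stamp leaves the result unchanged.

```python
import numpy as np


def autocorrelation(image):
    """Linear 2D autocorrelation via FFT, with zero lag moved to the array centre."""
    ny, nx = image.shape
    spectrum = np.fft.fft2(image, s=(2 * ny, 2 * nx))  # zero-pad to avoid wrap-around
    acf = np.fft.ifft2(np.abs(spectrum) ** 2).real
    return np.fft.fftshift(acf)


def ellipticity(image, centre):
    """Unweighted quadrupole moments about centre = (yc, xc), returned as (e1, e2)."""
    y, x = np.indices(image.shape)
    dy, dx = y - centre[0], x - centre[1]
    flux = image.sum()
    qxx = (image * dx * dx).sum() / flux
    qyy = (image * dy * dy).sum() / flux
    qxy = (image * dx * dy).sum() / flux
    return (qxx - qyy) / (qxx + qyy), 2.0 * qxy / (qxx + qyy)


def acf_ellipticity(image):
    """Ellipticity of the ACF; the zero-lag pixel sits at shape // 2 after fftshift."""
    acf = autocorrelation(image)
    return ellipticity(acf, (acf.shape[0] // 2, acf.shape[1] // 2))


# Noise-free elliptical Gaussian test stamp (sigma_x = 3, sigma_y = 2 pixels).
yy, xx = np.indices((41, 41))
stamp = np.exp(-0.5 * (((xx - 20) / 3.0) ** 2 + ((yy - 20) / 2.0) ** 2))
e1, e2 = acf_ellipticity(stamp)

# The same object translated within the stamp gives the same ACF ellipticity.
e1_shifted, e2_shifted = acf_ellipticity(np.roll(stamp, (3, -2), axis=(0, 1)))
```

For a Gaussian profile the ACF has twice the covariance of the image, so the moment-based ellipticity of the ACF equals that of the image itself; for real, noisy data a weight function would normally be needed.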

In our submissions, the number after the acronym NN stands for the size of the input data stamps, i.e. NN23
considers images with 23 pixels on a side. This is the first time full deconvolution of the data has been used to carry out
shape measurements.
Figure 46. The measured minus true shear for the 'NN23' submission as a function of the true shear, PSF ellipticity, PSF
FWHM, galaxy bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each dependency we fit a
linear function with a gradient and offset; for the top left hand panel these are the STEP m and c values. For the
shear dependency we additionally fit a quadratic term, q, separately. The top right hand corners show
$\Delta\chi^2 = \chi^2(\mathrm{gradient, offset}) - \chi^2(\mathrm{offset})$.
Figure 47. The STEP m and c values for the 'NN23' submission as a function of PSF FWHM and ellipticity, galaxy
bulge-to-disk offset angle, galaxy bulge-to-disk fraction and galaxy bulge size. For each variable we plot a linear relation
fitted to the behaviour of m and c. For clarity we do not explicitly quote errors on all parameters; the average errors on m and c
are ~ 0.005 and 5 X 10~^ respectively. The top right hand corners show $\Delta\chi^2 = \chi^2(\mathrm{gradient, offset}) - \chi^2(\mathrm{offset})$.
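
The linear fits and the chi-squared difference quoted in the figure captions of this appendix can be illustrated with the sketch below. It fits the measured minus true shear as gradient times x plus offset, compares this with an offset-only model, and returns the difference in chi-squared between the two. The synthetic data, the assumed bias values (m = 0.02, c = 1e-3), the common error `sigma` and the function names are all illustrative assumptions and are not taken from any submission.

```python
import numpy as np


def chi2(residual, sigma):
    """Chi-squared for residuals with a common Gaussian error sigma."""
    return float(np.sum((residual / sigma) ** 2))


def fit_dependency(x, delta, sigma):
    """Fit delta = gradient * x + offset and an offset-only model.

    Returns (gradient, offset, delta_chi2), where
    delta_chi2 = chi2(gradient, offset) - chi2(offset).
    """
    gradient, offset = np.polyfit(x, delta, 1)  # equal errors: unweighted least squares
    constant = np.mean(delta)                   # best-fitting offset-only model
    d_chi2 = chi2(delta - (gradient * x + offset), sigma) - chi2(delta - constant, sigma)
    return gradient, offset, d_chi2


# Synthetic example: when x is the true shear, gradient and offset are the STEP m and c.
rng = np.random.default_rng(42)
g_true = rng.uniform(-0.05, 0.05, 2000)
sigma = 1e-3
g_meas = (1.0 + 0.02) * g_true + 1e-3 + rng.normal(0.0, sigma, g_true.size)
m_fit, c_fit, d_chi2 = fit_dependency(g_true, g_meas - g_true, sigma)
```

Because the offset-only model is nested inside the linear model, this difference is never positive; a large negative value indicates that the gradient term is strongly preferred by the data.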


APPENDIX F: SIMULATIONS 

Inevitably, with a simulation the size of the GREAT10 Galaxy Challenge, there were several points at which the
data or the competition instructions were inadvertently misinterpreted by participants. We
list these here:

(i) Approximately 1% of the data were found to contain image glitches and were replaced during the challenge 
as a patch to the data. 

(ii) The functional PSFs used a convention in (x, y) coordinates and ellipticity for which some methods had to
make the following transformations: $e_2 \to -e_2$, $x \to y$ and $y \to x$, $r_{\rm PSF} \to r_{\rm PSF}/(1 + e_1^2 + e_2^2)$. This convention
warning was listed in the header of every functional PSF description during the challenge.

(iii) An additional two sets contained "pseudo-Airy" PSFs using the functional form of Kuijken (2006). However,
some participants misinterpreted the relation between the functional PSF description and the PSF FITS
images generated using the photon-shooting method in the GREAT10 code. This arose because in the
photon-shooting method photons at large r are generated using a uniform distribution from 0 to 1 and their
values are then replaced by a reciprocal; but the PDF of such a process yields a variation of $1/r^2$, not $1/r$, that when
modulated by the function gives 1/r* (not given in equation 21, Kuijken, 2006; the same equation that was
provided to participants). This was identified during the challenge and all participants were informed, and the
code used to produce the PSFs was made public (at the URL given at the end of this appendix) on 7th February 2011
(7 months before the challenge deadline); however, we have not included the results from these sets in this paper
because several submissions were affected.
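
The statistical point in item (iii) about taking the reciprocal of a uniform deviate can be checked numerically. If u is uniform on (0, 1], then r = 1/u has probability density $1/r^2$ for r >= 1, so that P(r > R) = 1/R. The sketch below only demonstrates this property; it is not the GREAT10 photon-shooting code, and the function names, sample size and binning are assumptions made for the example.

```python
import numpy as np


def reciprocal_uniform(n, seed=0):
    """Draw u uniformly on (0, 1] and return r = 1 / u, so that r >= 1."""
    rng = np.random.default_rng(seed)
    u = 1.0 - rng.random(n)  # rng.random is on [0, 1); this avoids u = 0
    return 1.0 / u


def density_slope(r, r_max=100.0, n_bins=20):
    """Logarithmic slope of the empirical density of r between 1 and r_max."""
    edges = np.logspace(0.0, np.log10(r_max), n_bins + 1)
    density, _ = np.histogram(r, bins=edges, density=True)
    centres = np.sqrt(edges[:-1] * edges[1:])  # geometric bin centres
    slope, _ = np.polyfit(np.log(centres), np.log(density), 1)
    return slope


r = reciprocal_uniform(1_000_000)
slope = density_slope(r)           # close to -2 for a 1/r^2 density
tail_fraction = np.mean(r > 10.0)  # close to 1/10 for a 1/r^2 density
```

A fitted slope near -2 rather than -1 is the one-dimensional signature of the mismatch described above between a sampled profile and a functional description that assumes a shallower fall-off.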

Each of these issues was addressed during the challenge; however, the pattern of participation (see Section
5; all submissions were made in the final 3 weeks) meant that some methods did not have time to create alternative
submissions before the official challenge closed. The challenge was extended by one week, into a post-challenge
submission period, but methods submitted during this time could not officially 'win' the competition. In
the event, none of these additional submissions improved on the winning score.

When using the GREAT08/GREAT10 code, a number of issues with its description in Bridle et al. (2010) should be
taken into account. The signal-to-noise used in Bridle et al. (2010) is approximately half the
standard definition used in this article. Equation (A8) makes the area of the galaxy invariant under the primary
ellipticity transformation (but not under the cosmological shear transformation), whereas equation (A9) does not
make the PSF area invariant under the ellipticity transformation. Also, the sense of the transformation in these
equations of g for galaxies and e for PSFs is different; the PSF shear is in the opposite direction to the cosmic
shear. Finally, we note two typos in Appendix A of Bridle et al. (2010): (1) in
equation (A5) the top left corner of the matrix should be r/^/{q), and (2) equation (A8) should be the transpose
of what is printed.



http://great.roe.ac.uk/data/code/sm/